Diagrams & Models
DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization
Lv, Zheqi, Zhang, Wenqiao, Zhang, Shengyu, Kuang, Kun, Wang, Feng, Wang, Yongwei, Chen, Zhengyu, Shen, Tao, Yang, Hongxia, Ooi, Beng Chin, Wu, Fei
Device Model Generalization (DMG) is a practical yet under-investigated research topic for on-device machine learning applications. It aims to improve the generalization ability of pre-trained models when deployed on resource-constrained devices, such as improving the performance of pre-trained cloud models on smartphones. While many works have investigated the data distribution shift between cloud and devices, most of them focus on model fine-tuning on personalized data for individual devices to facilitate DMG. Despite their promise, these approaches require on-device re-training, which is practically infeasible due to the overfitting problem and the high latency of gradient computation on real-time data. In this paper, we argue that the computational cost brought by fine-tuning is largely unnecessary. We consequently present a novel perspective on improving DMG without increasing computational cost, i.e., device-specific parameter generation, which directly maps data distribution to parameters. Specifically, we propose an efficient Device-cloUd collaborative parametErs generaTion framework, DUET. DUET is deployed on a powerful cloud server and requires only the low cost of forward propagation and the low latency of data transmission between the device and the cloud. By doing so, DUET can rehearse device-specific model weight realizations conditioned on the personalized real-time data of an individual device. Importantly, DUET elegantly connects the cloud and device as a 'duet' collaboration, frees DMG from fine-tuning, and enables a faster and more accurate DMG paradigm. We conduct an extensive experimental study of DUET on three public datasets, and the results confirm our framework's effectiveness and generalizability across different DMG tasks.
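To make the parameter-generation idea concrete, here is a minimal sketch assuming a hypernetwork-style generator on the cloud that maps a pooled summary of recent device data to the weights of a small device-side classifier head. The `HyperNet` class, the mean-pooling summary, and all shapes are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Illustrative hypernetwork: maps a summary of recent on-device
    data to the weights of a small device-side linear classifier."""
    def __init__(self, feat_dim: int, n_classes: int, hidden: int = 128):
        super().__init__()
        self.feat_dim, self.n_classes = feat_dim, n_classes
        # Emit one flat vector holding W (n_classes x feat_dim) and b.
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes * feat_dim + n_classes),
        )

    def forward(self, device_feats: torch.Tensor):
        # Summarize the device's real-time samples by mean pooling.
        summary = device_feats.mean(dim=0)
        flat = self.net(summary)
        W = flat[: self.n_classes * self.feat_dim].view(self.n_classes, self.feat_dim)
        b = flat[self.n_classes * self.feat_dim:]
        return W, b

# Cloud side: a single forward pass generates device-specific parameters;
# no on-device gradient computation is required.
hyper = HyperNet(feat_dim=32, n_classes=10)
recent_samples = torch.randn(64, 32)     # features of real-time device data
W, b = hyper(recent_samples)
logits = torch.randn(5, 32) @ W.T + b    # device applies the generated head
```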
Publishing Efficient On-device Models Increases Adversarial Vulnerability
Hong, Sanghyun, Carlini, Nicholas, Kurakin, Alexey
Recent increases in the computational demands of deep neural networks (DNNs) have sparked interest in efficient deep learning mechanisms, e.g., quantization or pruning. These mechanisms enable the construction of a small, efficient version of commercial-scale models with comparable accuracy, accelerating their deployment to resource-constrained devices. In this paper, we study the security considerations of publishing on-device variants of large-scale models. We first show that an adversary can exploit on-device models to make attacking the large models easier. In evaluations across 19 DNNs, by exploiting the published on-device models as a transfer prior, the adversarial vulnerability of the original commercial-scale models increases by up to 100x. We then show that the vulnerability increases as the similarity between a full-scale model and its efficient counterpart increases. Based on these insights, we propose a defense, $similarity$-$unpairing$, that fine-tunes on-device models with the objective of reducing this similarity. We evaluated our defense on all 19 DNNs and found that it reduces the transferability by up to 90% and increases the number of queries required by a factor of 10-100x. Our results suggest that further research is needed on the security (or even privacy) threats caused by publishing these efficient siblings.
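The unpairing objective can be sketched as a joint loss that preserves the on-device model's accuracy while pushing its behavior away from the full-scale model's. A minimal sketch, assuming similarity is measured as cosine similarity between the two models' input gradients (a common proxy for transferability; the paper's exact similarity measure and loss may differ):

```python
import torch
import torch.nn.functional as F

def unpairing_loss(full_model, device_model, x, y, lam=1.0):
    """Illustrative similarity-unpairing-style objective: keep the
    on-device model accurate while decorrelating its input gradients
    from those of the (frozen) full-scale model."""
    x = x.clone().requires_grad_(True)

    # Input gradient of the frozen full-scale model (no graph kept).
    g_full = torch.autograd.grad(
        F.cross_entropy(full_model(x), y), x
    )[0].detach()

    # Input gradient of the on-device model, kept in the graph so the
    # similarity term can be minimized during fine-tuning.
    out = device_model(x)
    g_dev = torch.autograd.grad(
        F.cross_entropy(out, y), x, create_graph=True
    )[0]

    task = F.cross_entropy(out, y)
    sim = F.cosine_similarity(g_dev.flatten(1), g_full.flatten(1), dim=1).mean()
    return task + lam * sim  # accuracy term + gradient-similarity penalty
```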
Robust Federated Learning for execution time-based device model identification under label-flipping attack
Sánchez, Pedro Miguel Sánchez, Celdrán, Alberto Huertas, Rubio, José Rafael Buendía, Bovet, Gérôme, Pérez, Gregorio Martínez
The computing device deployment explosion experienced in recent years, driven by advances in technologies such as the Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and the usually low complexity required to launch them. To solve this issue, several solutions have emerged that identify device models and types by combining behavioral fingerprinting with Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriate for scenarios where data privacy and protection are a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during federated model training is evaluated, using several aggregation mechanisms as countermeasures. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades once the percentage of fully malicious clients (all training samples poisoned) exceeds 50%.
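The coordinate-wise median aggregation evaluated above is simple to sketch. A minimal NumPy version, with synthetic flat parameter vectors standing in for real client model updates:

```python
import numpy as np

def coordinate_wise_median(client_updates):
    """Robust FL aggregation: take the median of each parameter
    coordinate across client updates, which limits the influence
    of label-flipping (poisoned) clients as long as they remain
    a minority."""
    stacked = np.stack(client_updates, axis=0)  # (n_clients, n_params)
    return np.median(stacked, axis=0)

# Usage: three honest clients and one fully poisoned client.
rng = np.random.default_rng(0)
honest = [np.array([1.0, 2.0, 3.0]) + rng.normal(0, 0.01, 3) for _ in range(3)]
poisoned = [np.array([100.0, -100.0, 100.0])]
agg = coordinate_wise_median(honest + poisoned)  # stays near [1, 2, 3]
```

Once poisoned clients exceed 50%, the median itself is controlled by the attackers, which matches the degradation reported in the abstract.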
FedZKT: Zero-Shot Knowledge Transfer towards Heterogeneous On-Device Models in Federated Learning
Federated learning enables distributed devices to collaboratively learn a shared prediction model without centralizing on-device training data. Most current algorithms require comparable individual efforts to train on-device models of the same structure and size, impeding participation from resource-constrained devices. Given the widespread yet heterogeneous devices nowadays, this paper proposes a new framework, FedZKT, that supports federated learning across heterogeneous on-device models via zero-shot knowledge transfer. Specifically, FedZKT allows participating devices to independently determine their on-device models. To transfer knowledge across on-device models, FedZKT develops a zero-shot distillation approach, in contrast to prior research that relies on a public dataset or a pre-trained data generator. To minimize on-device workload, the resource-intensive distillation task is assigned to the server, which constructs a generator to adversarially train with the ensemble of the received heterogeneous on-device models. The distilled central knowledge is then sent back in the form of the corresponding on-device model parameters, which can be easily absorbed at the device side. Experimental studies demonstrate the effectiveness and robustness of FedZKT for heterogeneous on-device models and challenging federated learning scenarios, such as non-iid data distributions and straggler effects.
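The server-side adversarial loop can be sketched as alternating generator and student updates. A simplified sketch, assuming a KL-based disagreement objective and a mean-logit teacher ensemble; the paper's actual losses, schedules, and how per-device parameters are produced are abstracted away here:

```python
import torch
import torch.nn.functional as F

def zero_shot_distill_step(generator, student, device_models,
                           opt_g, opt_s, z_dim=64, batch=128):
    """One illustrative FedZKT-style server step: the generator seeks
    synthetic inputs on which the student disagrees with the ensemble
    of received on-device (teacher) models; the student then matches
    the ensemble on fresh generated data. All modules are assumed
    compatible torch.nn.Module instances."""
    # --- Generator step: maximize student-teacher disagreement. ---
    x = generator(torch.randn(batch, z_dim))
    with torch.no_grad():
        teacher = torch.stack([m(x) for m in device_models]).mean(0)
    opt_g.zero_grad()
    neg_kl = -F.kl_div(F.log_softmax(student(x), dim=1),
                       F.softmax(teacher, dim=1), reduction="batchmean")
    neg_kl.backward()   # minimizing -KL == maximizing disagreement
    opt_g.step()

    # --- Student step: minimize disagreement on fresh samples. ---
    x = generator(torch.randn(batch, z_dim)).detach()
    with torch.no_grad():
        teacher = torch.stack([m(x) for m in device_models]).mean(0)
    opt_s.zero_grad()
    loss = F.kl_div(F.log_softmax(student(x), dim=1),
                    F.softmax(teacher, dim=1), reduction="batchmean")
    loss.backward()
    opt_s.step()
```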
Accounts with default creds found in 100 GE medical device models
More than 100 models of General Electric Healthcare medical devices come with hidden accounts that use the same default credentials and could be abused by hackers to gain access to medical equipment inside hospitals and clinics. Affected devices include the likes of CT scanners, X-Ray machines, and MRI imaging systems, according to CyberMDX, the security firm that discovered the hidden accounts earlier this year. The accounts, hidden from end-users, are included in the device firmware and are used by GE Healthcare servers to connect to on-premise devices and perform maintenance operations, run system health checks, obtain logs, run updates, and other actions. CyberMDX says the problem with these accounts is that they use the same default credentials, and that the credentials are public and can be found online by threat actors, who can then abuse them to gain access to hospital imaging systems and harvest patients' personal data.
Towards Personalized Modeling of the Female Hormonal Cycle: Experiments with Mechanistic Models and Gaussian Processes
Urteaga, Iñigo, Albers, David J., Wheeler, Marija Vlajic, Druet, Anna, Raffauf, Hans, Elhadad, Noémie
In this paper, we introduce a novel task for machine learning in healthcare, namely personalized modeling of the female hormonal cycle. The motivation for this work is to model the hormonal cycle and predict its phases in time, both for healthy individuals and for those with disorders of the reproductive system. Because there are individual differences in the menstrual cycle, we are particularly interested in personalized models that can account for individual idiosyncrasies, towards identifying phenotypes of menstrual cycles. As a first step, we consider the hormonal cycle as a set of observations through time. We use a previously validated mechanistic model to generate realistic hormonal patterns, and experiment with Gaussian process regression to estimate their values over time. Specifically, we are interested in the feasibility of predicting menstrual cycle phases under varying learning conditions: the number of cycles used for training, hormonal measurement noise and sampling rates, and informed vs. agnostic sampling of hormonal measurements. Our results indicate that Gaussian processes can help model the female menstrual cycle. We discuss the implications of our experiments in the context of modeling the female menstrual cycle.
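The Gaussian process setup is easy to reproduce in spirit. A minimal sketch using scikit-learn, with a synthetic roughly 28-day periodic signal standing in for the paper's mechanistic hormone traces; the kernel choice and all hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ExpSineSquared, WhiteKernel

# Synthetic stand-in for a mechanistic hormone trace: a ~28-day
# periodic signal with measurement noise, sampled irregularly.
rng = np.random.default_rng(0)
t_train = np.sort(rng.uniform(0, 84, 60))            # three training cycles
y_train = np.sin(2 * np.pi * t_train / 28) + rng.normal(0, 0.1, t_train.size)

# Periodic kernel captures the cyclic structure; WhiteKernel absorbs noise.
kernel = ExpSineSquared(length_scale=10.0, periodicity=28.0) + WhiteKernel(0.01)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(t_train[:, None], y_train)

# Predict a held-out fourth cycle with uncertainty estimates.
t_test = np.linspace(84, 112, 100)
mean, std = gp.predict(t_test[:, None], return_std=True)
```

Varying the number of training cycles, the noise level, and the sampling density in this sketch mirrors the learning conditions the paper investigates.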
Code reuse exposes over 120 D-Link device models to hacking
A recently discovered vulnerability in a D-Link network camera that allows attackers to remotely take over the device also exists in more than 120 other D-Link products. The vulnerability was initially discovered a month ago by researchers from security start-up Senrio in the D-Link DCS-930L, a Wi-Fi enabled camera that can be controlled remotely through a smartphone app. The flaw, a stack overflow, is located in a firmware service called dcp, which listens for commands on port 5978. Attackers can trigger the overflow by sending specially crafted commands and then execute rogue code on the system. The Senrio researchers used the flaw to silently change the administrator password for the Web-based management interface, but it could also be used to install malware on the device.
Waveform Driven Plasticity in BiFeO3 Memristive Devices: Model and Implementation
Mayr, Christian, Stärke, Paul, Partzsch, Johannes, Cederstroem, Love, Schüffny, Rene, Shuai, Yao, Du, Nan, Schmidt, Heidemarie
Memristive devices have recently been proposed as efficient implementations of plastic synapses in neuromorphic systems. The plasticity in these memristive devices, i.e. their resistance change, is defined by the applied waveforms. This behavior resembles biological synapses, whose plasticity is also triggered by mechanisms determined by local waveforms. However, learning in memristive devices has so far been approached mostly on a pragmatic technological level. The focus seems to be on finding any waveform that achieves spike-timing-dependent plasticity (STDP), without regard to the biological veracity of said waveforms or to further important forms of plasticity. Bridging this gap, we make use of a plasticity model driven by neuron waveforms that explains a large number of experimental observations, and adapt it to the characteristics of the recently introduced BiFeO$_3$ memristive material. Based on this approach, we show STDP for the first time for this material, with learning window replication superior to previous memristor-based STDP implementations. We also demonstrate in measurements that it is possible to overlay short- and long-term plasticity at a memristive device in the form of the well-known triplet plasticity. To the best of our knowledge, this is the first implementation of triplet plasticity on any physical memristive device.
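For reference, the learning window that such waveform engineering aims to reproduce can be written down directly. A sketch of the classic pair-based STDP window (a generic textbook model, not the paper's waveform-driven rule or its BiFeO$_3$ parameters):

```python
import numpy as np

def stdp_weight_change(dt, a_plus=0.01, a_minus=0.012,
                       tau_plus=20.0, tau_minus=20.0):
    """Classic pair-based STDP learning window: potentiation when the
    presynaptic spike precedes the postsynaptic one (dt > 0),
    depression otherwise. dt = t_post - t_pre in milliseconds."""
    dt = np.asarray(dt, dtype=float)
    return np.where(
        dt > 0,
        a_plus * np.exp(-dt / tau_plus),    # pre-before-post: potentiate
        -a_minus * np.exp(dt / tau_minus),  # post-before-pre: depress
    )

# Sample the window that a memristive implementation approximates by
# shaping the voltage waveforms applied across the device.
dts = np.linspace(-80, 80, 9)
print(dict(zip(dts.round(1), stdp_weight_change(dts).round(4))))
```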
A mechanistic model of early sensory processing based on subtracting sparse representations
Druckmann, Shaul, Hu, Tao, Chklovskii, Dmitri B.
Early stages of sensory systems face the challenge of compressing information from numerous receptors onto a much smaller number of projection neurons, a so-called communication bottleneck. To make more efficient use of limited bandwidth, compression may be achieved using predictive coding, whereby predictable, or redundant, components of the stimulus are removed. In the case of the retina, Srinivasan et al. (1982) suggested that feedforward inhibitory connections subtracting a linear prediction generated from nearby receptors implement such compression, resulting in biphasic center-surround receptive fields. However, feedback inhibitory circuits are common in early sensory circuits, and furthermore their dynamics may be nonlinear. Can such circuits implement predictive coding as well? Here, solving the transient dynamics of nonlinear reciprocal feedback circuits through analogy to a signal-processing algorithm called linearized Bregman iteration, we show that nonlinear predictive coding can be implemented in an inhibitory feedback circuit. In response to a step stimulus, interneuron activity in time constructs progressively less sparse but more accurate representations of the stimulus, a temporally evolving prediction. This analysis provides a powerful theoretical framework to interpret and understand the dynamics of early sensory processing in a variety of physiological experiments and yields novel predictions regarding the relation between activity and stimulus statistics.
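The linearized Bregman iteration itself is a short algorithm, and its iterates show exactly the sparse-to-dense progression the abstract describes. A minimal NumPy sketch for the standard sparse recovery problem (generic algorithm, not the paper's circuit mapping); step size and penalty are illustrative:

```python
import numpy as np

def linearized_bregman(A, b, mu=1.0, delta=None, n_iter=500):
    """Linearized Bregman iteration for min ||u||_1 s.t. A u = b.
    Early iterates are very sparse and become progressively denser
    and more accurate, mirroring the interneuron dynamics."""
    if delta is None:
        delta = 1.0 / np.linalg.norm(A, 2) ** 2    # step size for stability
    v = np.zeros(A.shape[1])
    u = np.zeros(A.shape[1])
    history = []
    for _ in range(n_iter):
        v = v + A.T @ (b - A @ u)                  # residual feedback
        u = delta * np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)  # shrinkage
        history.append(u.copy())
    return u, history

# Usage: recover a 5-sparse signal from compressed measurements.
rng = np.random.default_rng(1)
A = rng.normal(size=(40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.normal(0, 3, 5)
b = A @ x_true
u_hat, hist = linearized_bregman(A, b, mu=0.5)
```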
Diagrams as Scaffolds for Abductive Insights
Hoffmann, Michael Hans Georg (Georgia Institute of Technology)
Based on a typology of five basic forms of abduction, I propose a new definition of abductive insight that emphasizes in particular the inferential structure of a belief system that is able to explain a phenomenon after a new, abductively created component has been added to this system or the entire system has been abductively restructured. My thesis is, first, that the argumentative structure of the pursued problem solution guides abductive creativity and, second, that diagrammatic reasoning—if conceptualized according to the requirements defined by Charles Peirce—can support this guidance. This support is mainly possible based on the normative power of the system of representation that has to be used to construct diagrams and to perform experiments with them.