device model
AI-Guided Codesign Framework for Novel Material and Device Design applied to MTJ-based True Random Number Generators
Patel, Karan P., Maicke, Andrew, Arzate, Jared, Kwon, Jaesuk, Smith, J. Darby, Aimone, James B., Incorvia, Jean Anne C., Cardwell, Suma G., Schuman, Catherine D.
Designing devices for novel applications is oftentimes a time rigorous and resource-constrained process that requires utilizing computationally intensive simulations, device fabrication, and testing of the physical components in the application-specific environment. At the same time, customizing device characteristics to a particular application can allow for significant performance improvements. Automated codesign strategies are becoming increasingly popular with advancements in the artificial intelligence (AI) field that provide useful machine learning algorithms and frameworks [1-4]. Such codesign provides new opportunities to automatically customize devices for application-specific needs to maximize performance--whether that involves a particular capability, energy usage, latency, throughput, or even combinations of metrics. The operation of emerging devices, such as magnetic tunnel junctions (MTJs) [5-8], can be simulated using physics-based models that capture key behaviors based on materials and device properties.
Understanding Data Reconstruction Leakage in Federated Learning from a Theoretical Perspective
Wang, Zifan, Zhang, Binghui, Pang, Meng, Hong, Yuan, Wang, Binghui
The emerging federated learning (FL) [31] has been a great potential to protect data privacy. In FL, the participating devices keep and train their data locally, and only share the trained models (e.g., gradients or parameters), instead of the raw data, with a center server (e.g., cloud). The server updates its global model by aggregating the received device models, and broadcasts the updated global model to all participating devices such that all devices indirectly use all data from other devices. FL has been deployed by many companies such as Google [15], Microsoft [32], IBM [21], Alibaba [2], and applied in various privacy-sensitive applications, including on-device item ranking [31], content suggestions for on-device keyboards [6], next word prediction [27], health monitoring [38], and medical imaging [23]. Unfortunately, recent works show that, though only sharing device models, it is still possible for an adversary (e.g., malicious server) to perform the severe data reconstruction attack (DRA) to FL [57], where an adversary could directly reconstruct the device's training data via the shared device models. Later, a bunch of follow-up enhanced attacks [20, 45, 55, 51, 47, 53, 22, 56, 9, 3, 30, 11, 43, 48, 18, 49, 35]) are proposed by either incorporating (known or unrealistic) prior knowledge or requiring an auxiliary dataset to simulate the training data distribution. However, we note that existing DRA methods have several limitations: First, they are sensitive to the initialization (which is also observed in [47]). For instance, we show in Figure 1 that the attack performance of iDLG [55] and DLG [57] are significantly influenced by initial parameters (i.e., the mean and standard deviation) of a Gaussian distribution, where the initial data is sampled from.
Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference
Gallo, Manuel Le, Lammie, Corey, Buechel, Julian, Carta, Fabio, Fagbohungbe, Omobayode, Mackin, Charles, Tsai, Hsinyu, Narayanan, Vijay, Sebastian, Abu, Maghraoui, Kaoutar El, Rasch, Malte J.
Analog In-Memory Computing (AIMC) is a promising approach to reduce the latency and energy consumption of Deep Neural Network (DNN) inference and training. However, the noisy and non-linear device characteristics, and the non-ideal peripheral circuitry in AIMC chips, require adapting DNNs to be deployed on such hardware to achieve equivalent accuracy to digital computing. In this tutorial, we provide a deep dive into how such adaptations can be achieved and evaluated using the recently released IBM Analog Hardware Acceleration Kit (AIHWKit), freely available at https://github.com/IBM/aihwkit. The AIHWKit is a Python library that simulates inference and training of DNNs using AIMC. We present an in-depth description of the AIHWKit design, functionality, and best practices to properly perform inference and training. We also present an overview of the Analog AI Cloud Composer, that provides the benefits of using the AIHWKit simulation platform in a fully managed cloud setting. Finally, we show examples on how users can expand and customize AIHWKit for their own needs. This tutorial is accompanied by comprehensive Jupyter Notebook code examples that can be run using AIHWKit, which can be downloaded from https://github.com/IBM/aihwkit/tree/master/notebooks/tutorial.
One-stop Training of Multiple Capacity Models
Jiang, Lan, Huang, Haoyang, Zhang, Dongdong, Jiang, Rui, Wei, Furu
Training models with varying capacities can be advantageous for deploying them in different scenarios. While high-capacity models offer better performance, low-capacity models require fewer computing resources for training and inference. In this work, we propose a novel one-stop training framework to jointly train high-capacity and low-capactiy models. This framework consists of two composite model architectures and a joint training algorithm called Two-Stage Joint-Training (TSJT). Unlike knowledge distillation, where multiple capacity models are trained from scratch separately, our approach integrates supervisions from different capacity models simultaneously, leading to faster and more efficient convergence. Extensive experiments on the multilingual machine translation benchmark WMT10 show that our method outperforms low-capacity baseline models and achieves comparable or better performance on high-capacity models. Notably, the analysis demonstrates that our method significantly influences the initial training process, leading to more efficient convergence and superior solutions.
Device Modeling Bias in ReRAM-based Neural Network Simulations
Yousuf, Osama, Hossen, Imtiaz, Daniels, Matthew W., Lueker-Boden, Martin, Dienstfrey, Andrew, Adam, Gina C.
Data-driven modeling approaches such as jump tables are promising techniques to model populations of resistive random-access memory (ReRAM) or other emerging memory devices for hardware neural network simulations. As these tables rely on data interpolation, this work explores the open questions about their fidelity in relation to the stochastic device behavior they model. We study how various jump table device models impact the attained network performance estimates, a concept we define as modeling bias. Two methods of jump table device modeling, binning and Optuna-optimized binning, are explored using synthetic data with known distributions for benchmarking purposes, as well as experimental data obtained from TiOx ReRAM devices. Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably particularly at low number of points in the device dataset, sometimes over-promising, sometimes under-promising target network accuracy. This paper also proposes device level metrics that indicate similar trends with the modeling bias metric at the network level. The proposed approach opens the possibility for future investigations into statistical device models with better performance, as well as experimentally verified modeling bias in different in-memory computing and neural network architectures.
Robust Federated Learning for execution time-based device model identification under label-flipping attack
Sánchez, Pedro Miguel Sánchez, Celdrán, Alberto Huertas, Rubio, José Rafael Buendía, Bovet, Gérôme, Pérez, Gregorio Martínez
The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, several solutions have emerged to identify device models and types based on the combination of behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriated for scenarios where data privacy and protection is a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during the federated model training is evaluated, using several aggregation mechanisms as countermeasure. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades when the percentage of fully malicious clients (all training samples poisoned) grows over 50%.
TALKING DATA MOBILE USER DEMOGRAPHICS
Nothing is more comforting than being greeted by your favorite drink just as you walk through the door of the corner café. While a thoughtful barista knows you take a macchiato every Wednesday morning at 8:15, it's much more difficult in a digital space for your preferred brands to personalize your experience. Talking Data, China's largest third-party mobile data platform, understands that everyday choices and behaviors paint a picture of who we are and what we value. Currently, Talking Data is seeking to leverage behavioral data from more than 70% of the 500 million mobile devices active daily in China to help its clients better understand and interact with their audiences. So, the business problem is to predict the demographic characteristics of the users using their app usage,geographical location and device properties.
Amazon's Ring saves user data for every single time the smart doorbell is activated
Amazon's Ring keeps a log of every single time one of its cameras, doorbells or apps is activated and used, it has been revealed. Events that are logged include motion detected by the cameras, a doorbell being activated or pressed by a visitor, or an action by the user to activate a live feed to converse with a visitor. The data recorded also includes exact GPS co-ordinates of the devices as well as the duration of each event in seconds. It has also been found that every time the Ring app is used by a customer, a permanent note of the device model and which network it uses is saved and recorded in a vast database. Experts have slammed the feature and said it poses a'serious threat to people's privacy' and could easily be stolen or misused by criminals.
Development, Demonstration, and Validation of Data-driven Compact Diode Models for Circuit Simulation and Analysis
Aadithya, K., Kuberry, P., Paskaleva, B., Bochev, P., Leeson, K., Mar, A., Mei, T., Keiter, E.
Compact semiconductor device models are essential for efficiently designing and analyzing large circuits. However, traditional compact model development requires a large amount of manual effort and can span many years. Moreover, inclusion of new physics (eg, radiation effects) into an existing compact model is not trivial and may require redevelopment from scratch. Machine Learning (ML) techniques have the potential to automate and significantly speed up the development of compact models. In addition, ML provides a range of modeling options that can be used to develop hierarchies of compact models tailored to specific circuit design stages. In this paper, we explore three such options: (1) table-based interpolation, (2)Generalized Moving Least-Squares, and (3) feed-forward Deep Neural Networks, to develop compact models for a p-n junction diode. We evaluate the performance of these "data-driven" compact models by (1) comparing their voltage-current characteristics against laboratory data, and (2) building a bridge rectifier circuit using these devices, predicting the circuit's behavior using SPICE-like circuit simulations, and then comparing these predictions against laboratory measurements of the same circuit.