magnetic tunnel junction
Securing generative artificial intelligence with parallel magnetic tunnel junction true randomness
Bao, Youwei, Yang, Shuhan, Yang, Hyunsoo
Deterministic pseudo-random number generators (PRNGs) used in generative artificial intelligence (GAI) models produce predictable patterns vulnerable to exploitation by attackers. Conventional defences against these vulnerabilities often come with significant energy and latency overhead. Here, we embed hardware-generated true random bits from spin-transfer torque magnetic tunnel junctions (STT-MTJs) to address these challenges. A highly parallel, FPGA-assisted prototype computing system delivers megabit-per-second true random numbers, passing NIST randomness tests after in-situ operations with minimal overhead. Integrating the hardware random bits into a generative adversarial network (GAN) trained on CIFAR-10 reduces insecure outputs by up to 18.6 times compared to a low-quality random number generator (RNG) baseline. With nanosecond switching speed, high energy efficiency, and established scalability, our STT-MTJ-based system holds the potential to scale beyond 10^6 parallel cells, achieving gigabit-per-second throughput suitable for large language model sampling. This advancement highlights spintronic RNGs as practical security components for next-generation GAI systems.
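As an illustration of how raw random bits could feed a generative model, the following minimal sketch turns a bit stream into a standard-normal latent vector with the Box-Muller transform. It is not the authors' pipeline: the function names are hypothetical, and `os_bit_source` uses the operating system's entropy pool purely as a stand-in for the STT-MTJ hardware.

```python
import math
import secrets


def bits_to_uniform(bits):
    """Pack a sequence of 0/1 bits (most significant first) into a float in [0, 1)."""
    value = 0
    for b in bits:
        value = (value << 1) | (b & 1)
    return value / (1 << len(bits))


def latent_from_bits(bit_source, dim, bits_per_sample=32):
    """Draw `dim` standard-normal values from raw bits via the Box-Muller transform."""
    out = []
    while len(out) < dim:
        u1 = bits_to_uniform(bit_source(bits_per_sample))
        u2 = bits_to_uniform(bit_source(bits_per_sample))
        u1 = max(u1, 2.0 ** -bits_per_sample)  # avoid log(0)
        radius = math.sqrt(-2.0 * math.log(u1))
        out.append(radius * math.cos(2.0 * math.pi * u2))
        out.append(radius * math.sin(2.0 * math.pi * u2))
    return out[:dim]


def os_bit_source(n):
    """Hypothetical stand-in for the hardware: n bits from the OS entropy pool."""
    word = secrets.randbits(n)
    return [(word >> i) & 1 for i in range(n)]


# An 8-dimensional latent vector that a generator network could consume.
z = latent_from_bits(os_bit_source, dim=8)
```

Swapping `os_bit_source` for a function that reads bits from a hardware interface would leave the rest of the sketch unchanged, which is the point of keeping the bit source as a parameter.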
Bayesian Reasoning Enabled by Spin-Orbit Torque Magnetic Tunnel Junctions
Xu, Yingqian, Li, Xiaohan, Wan, Caihua, Zhang, Ran, He, Bin, Liu, Shiqiang, Xia, Jihao, Kong, Dehao, Xiong, Shilong, Yu, Guoqiang, Han, Xiufeng
The rapid development of artificial intelligence (AI) over the past few decades has been nourished by advancements in machine learning algorithms, increased computational power, and the availability of vast amounts of data[1], which has in turn revolutionized numerous fields including but not limited to medical science and healthcare, information technologies, finance, and transportation. This regenerative feedback between AI and its applications leads to a further explosive growth of data and expansion of model scales, which calls for a paradigm shift toward efficient and fast computing and memory technologies, especially advanced algorithms and emerging AI hardware enabled by nonvolatile memories[2]. In this respect, emerging memory technologies such as magnetic random-access memories[3], ferroelectric random-access memories[4], resistive random-access memories[5, 6] and phase-change random-access memories[7] have been implemented to accelerate AI computing, for instance matrix multiplication[8]. Thanks to their high energy efficiency, fast speed, long endurance, and versatile functionalities, spintronic devices based on spin-orbit torques, one prominent example among emerging memories, have shown great potential for hardware-accelerated true random number generation (TRNG)[9-18] in addition to matrix multiplication. For instance, high-quality true random number generators with stable and reconfigurable probability tunability have been demonstrated using SOT-MTJs[19-21].
Noise-based Local Learning using Stochastic Magnetic Tunnel Junctions
Koenders, Kees, Schnitzpan, Leo, Kammerbauer, Fabian, Shu, Sinan, Jakob, Gerhard, Kläui, Mathis, Mentink, Johan, Ahmad, Nasir, van Gerven, Marcel
Brain-inspired learning in physical hardware has enormous potential to learn fast at minimal energy expenditure. One of the characteristics of biological learning systems is their ability to learn in the presence of various noise sources. Inspired by this observation, we introduce a novel noise-based learning approach for physical systems implementing multi-layer neural networks. Simulation results show that our approach allows for effective learning whose performance approaches that of the conventional effective yet energy-costly backpropagation algorithm. Using a spintronics hardware implementation, we demonstrate experimentally that learning can be achieved in a small network composed of physical stochastic magnetic tunnel junctions. These results provide a path towards efficient learning in general physical systems which embraces rather than mitigates the noise inherent in physical devices.
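The abstract does not spell out the learning rule, so the sketch below shows only a generic member of the noise-based family: weight perturbation, in which random noise is injected, the resulting change in loss is measured, and the weights move against the noise in proportion to that change. The single linear neuron, the toy data, and the values of `sigma` and `lr` are all assumptions made for illustration, not the authors' method.

```python
import random

# Toy regression task (hypothetical): learn y = 2*a - b with one linear neuron.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
Y = [2.0 * a - 1.0 * b for a, b in X]


def loss(w):
    """Mean squared error of the linear neuron with weights w."""
    return sum((w[0] * a + w[1] * b - t) ** 2 for (a, b), t in zip(X, Y)) / len(X)


rng = random.Random(0)
w = [0.0, 0.0]
initial_loss = loss(w)
sigma, lr = 0.01, 0.05  # noise amplitude and learning rate (assumed values)

for _ in range(2000):
    # Inject noise, observe how the loss changes, and step against the noise
    # in proportion to that change. No gradient is backpropagated.
    noise = [rng.gauss(0.0, 1.0) for _ in w]
    perturbed = [wi + sigma * n for wi, n in zip(w, noise)]
    delta = (loss(perturbed) - loss(w)) / sigma
    w = [wi - lr * delta * n for wi, n in zip(w, noise)]

final_loss = loss(w)
```

Averaged over the noise, this update follows the gradient of the loss, which is why a rule that needs only noise and a scalar loss signal can approach the performance of backpropagation on small problems.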
AI-Guided Codesign Framework for Novel Material and Device Design applied to MTJ-based True Random Number Generators
Patel, Karan P., Maicke, Andrew, Arzate, Jared, Kwon, Jaesuk, Smith, J. Darby, Aimone, James B., Incorvia, Jean Anne C., Cardwell, Suma G., Schuman, Catherine D.
Designing devices for novel applications is often a time-intensive and resource-constrained process that requires computationally intensive simulations, device fabrication, and testing of the physical components in the application-specific environment. At the same time, customizing device characteristics to a particular application can allow for significant performance improvements. Automated codesign strategies are becoming increasingly popular with advancements in the artificial intelligence (AI) field that provide useful machine learning algorithms and frameworks [1-4]. Such codesign provides new opportunities to automatically customize devices for application-specific needs to maximize performance--whether that involves a particular capability, energy usage, latency, throughput, or even combinations of metrics. The operation of emerging devices, such as magnetic tunnel junctions (MTJs) [5-8], can be simulated using physics-based models that capture key behaviors based on materials and device properties.
Machine Learning Quantum Systems with Magnetic p-bits
Chowdhury, Shuvro, Camsari, Kerem Y.
The slowing down of Moore's Law has led to a crisis as the computing workloads of Artificial Intelligence (AI) algorithms continue skyrocketing. There is an urgent need for scalable and energy-efficient hardware catering to the unique requirements of AI algorithms and applications. In this environment, probabilistic computing with p-bits has emerged as a scalable, domain-specific, and energy-efficient computing paradigm, particularly useful for probabilistic applications and algorithms. In particular, spintronic devices such as stochastic magnetic tunnel junctions (sMTJs) show great promise in designing integrated p-computers. Here, we examine how a scalable probabilistic computer with such magnetic p-bits can be useful for an emerging field combining machine learning and quantum physics.
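For readers unfamiliar with p-bits, the sketch below simulates the commonly used software model in which each p-bit takes the value +1 with probability (1 + tanh(beta * I)) / 2, where I is its input from the other p-bits. The two-p-bit network, the coupling matrix `J`, and all parameter values are assumptions chosen for illustration; a software random number generator stands in for the stochastic MTJ.

```python
import math
import random


def pbit_sweep(state, J, h, beta, rng):
    """Update every p-bit once, in sequence: m_i = +1 if tanh(beta * I_i) > r, else -1."""
    n = len(state)
    for i in range(n):
        # Input to p-bit i: bias plus weighted sum of the other p-bits.
        current = h[i] + sum(J[i][j] * state[j] for j in range(n) if j != i)
        r = rng.uniform(-1.0, 1.0)  # stands in for the device's intrinsic noise
        state[i] = 1 if math.tanh(beta * current) > r else -1
    return state


rng = random.Random(0)
J = [[0.0, 1.0], [1.0, 0.0]]  # symmetric coupling that favours agreement
h = [0.0, 0.0]
state = [1, -1]

sweeps = 5000
agree = 0
for _ in range(sweeps):
    pbit_sweep(state, J, h, beta=1.0, rng=rng)
    agree += state[0] == state[1]

# Fraction of sweeps in which the two p-bits agree; for the Boltzmann
# distribution of this network the expected value is e / (e + 1/e), about 0.88.
agree_fraction = agree / sweeps
```

The sequential update is a Gibbs sampler, so the network samples from a Boltzmann distribution defined by `J` and `h`; this is the property that makes p-bits useful for probabilistic machine learning workloads.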
Classification of multi-frequency RF signals by extreme learning, using magnetic tunnel junctions as neurons and synapses
Leroux, Nathan, Marković, Danijela, Sanz-Hernández, Dédalo, Trastoy, Juan, Bortolotti, Paolo, Schulman, Alejandro, Benetti, Luana, Jenkins, Alex, Ferreira, Ricardo, Grollier, Julie, Mizrahi, Alice
Extracting information from radiofrequency (RF) signals using artificial neural networks at low energy cost is a critical need for a wide range of applications from radars to health. These RF inputs are composed of multiple frequencies. Here we show that magnetic tunnel junctions can process analogue RF inputs with multiple frequencies in parallel and perform synaptic operations. Using a backpropagation-free method called extreme learning, we classify noisy images encoded by RF signals, using experimental data from magnetic tunnel junctions functioning as both synapses and neurons. We achieve the same accuracy as an equivalent software neural network. These results are a key step for embedded radiofrequency artificial intelligence.
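Extreme learning keeps a randomly initialised hidden layer fixed and trains only the output weights, typically by regularised least squares, so no error has to be backpropagated through the hidden layer. The sketch below applies this idea to a toy XOR problem; the tanh hidden units are a software stand-in for the physical MTJ response, and the network size, weight ranges, and ridge parameter `lam` are assumptions.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def hidden(x, W, b):
    """Fixed random nonlinear features, plus a constant 1 for the readout bias."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)] + [1.0]


rng = random.Random(1)
n_hidden = 12
W = [[rng.uniform(-2, 2) for _ in range(2)] for _ in range(n_hidden)]  # never trained
b = [rng.uniform(-2, 2) for _ in range(n_hidden)]                      # never trained

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1.0, 1.0, 1.0, -1.0]  # XOR with +/-1 labels

# Train the readout only: ridge regression (H^T H + lam * I) w = H^T y.
H = [hidden(x, W, b) for x in X]
d = n_hidden + 1
lam = 1e-4
A = [[sum(h[i] * h[j] for h in H) + (lam if i == j else 0.0) for j in range(d)]
     for i in range(d)]
rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(d)]
w_out = solve(A, rhs)

pred = [1.0 if sum(w * f for w, f in zip(w_out, hidden(x, W, b))) > 0 else -1.0
        for x in X]
```

Because training reduces to a single linear solve on the hidden-layer outputs, the hidden layer can be any fixed physical system whose responses can be measured, which is what makes the method attractive for hardware neurons.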
Implementation of a Binary Neural Network on a Passive Array of Magnetic Tunnel Junctions
Goodwill, Jonathan M., Prasad, Nitin, Hoskins, Brian D., Daniels, Matthew W., Madhavan, Advait, Wan, Lei, Santos, Tiffany S., Tran, Michael, Katine, Jordan A., Braganca, Patrick M., Stiles, Mark D., McClelland, Jabez J.
Avenues to mitigate the main issue, the von Neumann bottleneck, include in-memory and near-memory architectures, as well as algorithmic approaches. Here we leverage the low-power and inherently binary operation of magnetic tunnel junctions (MTJs) to demonstrate neural network hardware inference based on passive arrays of MTJs. In general, transferring a trained network model to hardware for inference is confronted by degradation in performance due to device-to-device variations, write errors, parasitic resistance, and nonidealities in the substrate. To quantify the effect of these hardware realities, we benchmark 300 unique weight matrix solutions of a 2-layer perceptron to classify the Wine dataset for both classification accuracy and write fidelity. Despite device imperfections, we achieve software-equivalent accuracy of up to 95.3% with proper tuning of network parameters in 15 × 15 MTJ arrays having a range of device sizes. The success of this tuning process shows that new metrics are needed to characterize the performance and quality of networks reproduced in mixed-signal hardware.
I. INTRODUCTION
Over the past decade, artificial intelligence algorithms have achieved human-level performance on increasingly complex tasks at the cost of increased neural network size, computing resources, and energy consumption [1-5]. OpenAI's GPT-3, for example, a state-of-the-art natural language processor, contains 175 billion parameters and requires 3.14 × 10 [...] Running these algorithms for inference applications--applications that require the model to make predictions but not learn new information--requires lesser but still overwhelming amounts of energy. This energy inefficiency is in part due to implementing these algorithms using general-purpose hardware such as central and graphical processing units (CPUs and GPUs).
Because CPUs and GPUs have traditional von Neumann computing architectures, they do not store data in the same spatial location as where computation is carried out. For this reason, energy is consumed in moving the data, and the speed of computation is throttled by the time it takes to shuttle from the storage to the computation location. This so-called von Neumann bottleneck has been shown to be severe on large neural network models, with studies showing the majority of the network time and energy can be expended distributing gradient and model data [11-13]. Algorithmic approaches to lessening the data bottleneck have focused on simplifying neural network models to achieve equivalent accuracy with less memory overhead.
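One such simplification is to binarise the weights, which suits the two-state nature of an MTJ. The sketch below is a minimal illustration under stated assumptions: each weight is stored as a differential pair of junctions, the conductances `g_p` and `g_ap` are arbitrary placeholder values, and the array is ideal (no wire resistance, write errors, or device variation). It does not reproduce the encoding used in the paper.

```python
def binarize(weights):
    """Keep only the sign of each trained weight (+1 or -1)."""
    return [[1 if w >= 0 else -1 for w in row] for row in weights]


def crossbar_currents(signs, voltages, g_p=2.0, g_ap=1.0):
    """Ideal differential crossbar readout.

    A +1 weight is stored as the conductance pair (g_p, g_ap) and a -1 weight
    as (g_ap, g_p). Each output is the difference between the currents summed
    on the 'plus' and 'minus' lines, so it is proportional to the signed
    matrix-vector product.
    """
    currents = []
    for row in signs:
        i_plus = sum(v * (g_p if s > 0 else g_ap) for s, v in zip(row, voltages))
        i_minus = sum(v * (g_ap if s > 0 else g_p) for s, v in zip(row, voltages))
        currents.append(i_plus - i_minus)
    return currents


trained = [[0.7, -0.2, 0.1], [-0.4, 0.9, -0.3]]  # hypothetical trained weights
signs = binarize(trained)
voltages = [0.5, -1.0, 0.25]                     # hypothetical input voltages

currents = crossbar_currents(signs, voltages)
# Reference: the same product computed in software with the +/-1 weights.
ideal = [sum(s * v for s, v in zip(row, voltages)) for row in signs]
```

In this ideal model `currents` equals `(g_p - g_ap)` times `ideal`, and each weight needs one bit of storage instead of a full-precision number; the nonidealities listed in the abstract are what cause real arrays to deviate from it.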
A Vision to Compute like Nature
Classical computing using digital symbols--equivalent to a Turing Machine--is reaching its limits. It is undeniable that computing's historic exponential performance increases have improved the human condition. Yet such increases are a thing of the past due in large part to the constraints of physics and how today's systems are constructed. Hardware device designers struggle to eliminate the effects of nanometer-scale thermodynamic fluctuations, and the soaring cost of fabrication plants has eliminated all but a few companies as a source of future chips. Software developers' ability to imagine and program effective computational abstractions and implementations is clearly challenged in complex domains like economic systems, ecological systems, medicine, social systems, warfare, and autonomous vehicles.