Training Physical Neural Networks for Analog In-Memory Computing

Yusuke Sakemi, Yuji Okamoto, Takashi Morie, Sou Nobukawa, Takeo Hosomi, Kazuyuki Aihara

arXiv.org Artificial Intelligence 

Deep learning is a state-of-the-art methodology in numerous domains, including image recognition, natural language processing, and data generation [1]. The discovery of scaling laws in deep learning models [2, 3] has motivated the development of increasingly larger models, commonly referred to as foundation models [4, 5, 6]. Recent studies have shown that performance on reasoning tasks can be improved through iterative computation during the inference phase [7]. While computational power continues to be a major driver of advances in artificial intelligence (AI), the associated costs remain a significant barrier to broader adoption across diverse industries [8, 9]. This issue is especially critical in edge AI systems, where energy consumption is constrained by limited battery capacity, making more efficient computation paramount [10].

One promising strategy for enhancing energy efficiency is to fabricate dedicated hardware. Since matrix-vector multiplication is the computational core of deep learning, parallelizing it greatly enhances computational efficiency [11]. Moreover, in data-driven applications such as deep learning, a substantial portion of power consumption is due to data movement between the processor and memory, a limitation commonly referred to as the von Neumann bottleneck [12].
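To make the role of matrix-vector multiplication concrete, the following minimal sketch (not taken from the paper; the layer sizes and variable names W, x, b are illustrative assumptions) shows that the forward pass of a single dense layer reduces to one matrix-vector product plus a cheap elementwise nonlinearity, which is why hardware that parallelizes this product, and keeps the weights stationary in memory, targets the dominant part of the workload.

```python
import numpy as np

# Illustrative sketch: forward pass of one dense layer.
# The sizes and random data below are placeholders, not values from the paper.
rng = np.random.default_rng(0)
n_in, n_out = 1024, 1024
W = rng.standard_normal((n_out, n_in))   # weight matrix (stored in memory)
b = rng.standard_normal(n_out)           # bias vector
x = rng.standard_normal(n_in)            # input activation vector

y = np.maximum(W @ x + b, 0.0)           # ReLU(W x + b)

# The product W @ x costs O(n_out * n_in) multiply-accumulate operations,
# whereas the bias addition and ReLU cost only O(n_out). Dedicated hardware
# that performs the matrix-vector product in parallel, without shuttling W
# between processor and memory, therefore addresses both the arithmetic
# cost and the data-movement (von Neumann) bottleneck noted above.
print(y.shape)  # (1024,)
```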