In-Memory Computing
Efficient Deployment of Transformer Models in Analog In-Memory Computing Hardware
Li, Chen, Lammie, Corey, Gallo, Manuel Le, Rajendran, Bipin
Analog in-memory computing (AIMC) has emerged as a promising solution to overcome the von Neumann bottleneck, accelerating neural network computations and improving computational efficiency. While AIMC has demonstrated success with architectures such as CNNs, MLPs, and RNNs, deploying transformer-based models using AIMC presents unique challenges. Transformers are expected to handle diverse downstream tasks and adapt to new user data or instructions after deployment, which requires more flexible approaches to suit AIMC constraints. In this paper, we propose a novel method for deploying pre-trained transformer models onto AIMC hardware. Unlike traditional approaches requiring hardware-aware training, our technique allows direct deployment without the need for retraining the original model. Instead, we utilize lightweight, low-rank adapters -- compact modules stored in digital cores -- to adapt the model to hardware constraints. We validate our approach on MobileBERT, demonstrating accuracy on par with, or even exceeding, a traditional hardware-aware training approach. Our method is particularly appealing in multi-task scenarios, as it enables a single analog model to be reused across multiple tasks. Moreover, it supports on-chip adaptation to new hardware constraints and tasks without updating analog weights, providing a flexible and versatile solution for real-world AI applications. Code is available.
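The adapter idea in this abstract can be sketched in a few lines. The following toy model (function names, the Gaussian noise model, and the dimensions are illustrative assumptions, not the authors' implementation) shows the core structure: a frozen, noisy analog matrix-vector product plus a cheap low-rank correction computed in digital.

```python
import random

def matvec(W, x):
    # Plain matrix-vector product: each row of W dotted with x.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def analog_matvec(W, x, noise=0.05, seed=0):
    # Toy model of the analog core: the pre-trained weights are programmed
    # once and each read is perturbed by device noise (assumed Gaussian here).
    rng = random.Random(seed)
    noisy_W = [[w + rng.gauss(0.0, noise) for w in row] for row in W]
    return matvec(noisy_W, x)

def adapted_forward(W, A, B, x):
    # Analog path (frozen, noisy) plus a digital low-rank correction A @ (B @ x).
    # A is d_out x r and B is r x d_in with small rank r, so the adapter is
    # compact enough to store and update in the digital cores.
    y_analog = analog_matvec(W, x)
    y_adapter = matvec(A, matvec(B, x))
    return [ya + yc for ya, yc in zip(y_analog, y_adapter)]
```

Because only A and B are trained, the analog weights never need reprogramming: switching tasks or compensating new hardware non-idealities means swapping in a different pair of small digital matrices.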
Approximate ADCs for In-Memory Computing
Ghosh, Arkapravo, Sadana, Hemkar Reddy, Debnath, Mukut, Maji, Panthadip, Negi, Shubham, Gupta, Sumeet, Sharad, Mrigank, Roy, Kaushik
In-memory computing (IMC) architectures for deep learning (DL) accelerators leverage energy-efficient and highly parallel matrix vector multiplication (MVM) operations, implemented directly in memory arrays. Such IMC designs have been explored based on CMOS as well as emerging non-volatile memory (NVM) technologies like RRAM. IMC architectures generally involve a large number of cores consisting of memory arrays, storing the trained weights of the DL model. Peripheral units like DACs and ADCs are also used for applying inputs and reading out the output values. Recently reported designs reveal that the ADCs required for reading out the MVM results consume more than 85% of the total compute power and also dominate the area, thereby eroding the benefits of the IMC scheme. Mitigation of imperfections in the ADCs, namely non-linearity and variations, incurs significant design overheads due to dedicated calibration units. In this work we present peripheral-aware design of IMC cores to mitigate such overheads. It involves incorporating the non-idealities of the ADCs in the training of the DL models, along with those of the memory units. The proposed approach applies equally well to both current-mode and charge-mode MVM operations demonstrated in recent years, and can significantly simplify the design of mixed-signal IMC units.
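The training trick described here amounts to inserting an ADC transfer function into the forward pass. A minimal sketch (the gain error, bit width, and full-scale range below are invented parameters, not the paper's ADC model):

```python
def adc_model(value, n_bits=6, v_max=1.0, gain_error=0.02):
    # Toy ADC transfer function: a static gain error, clipping at the
    # full-scale range, then uniform quantization to 2**n_bits - 1 levels.
    levels = 2 ** n_bits - 1
    v = value * (1.0 + gain_error)            # static non-ideality
    v = max(-v_max, min(v_max, v))            # clipping at full scale
    code = round((v + v_max) / (2 * v_max) * levels)
    return code / levels * 2 * v_max - v_max  # analog value the code encodes

def hardware_aware_mvm(W, x):
    # During training, each MVM result is passed through the ADC model so
    # the network learns weights that tolerate the converter's imperfections.
    return [adc_model(sum(w * xi for w, xi in zip(row, x))) for row in W]
```

Training through this function lets approximate, uncalibrated ADCs be used at inference time, which is how the design overhead of dedicated calibration units is avoided.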
Topology Optimization of Random Memristors for Input-Aware Dynamic SNN
Wang, Bo, Wang, Shaocong, Lin, Ning, Li, Yi, Yu, Yifei, Zhang, Yue, Yang, Jichang, Wu, Xiaoshan, He, Yangu, Wang, Songqi, Chen, Rui, Li, Guoqi, Qi, Xiaojuan, Wang, Zhongrui, Shang, Dashan
There is unprecedented development in machine learning, exemplified by recent large language models and world simulators, which are artificial neural networks running on digital computers. However, they still cannot parallel human brains in terms of energy efficiency and the streamlined adaptability to inputs of different difficulties, due to differences in signal representation, optimization, run-time reconfigurability, and hardware architecture. To address these fundamental challenges, we introduce pruning optimization for input-aware dynamic memristive spiking neural network (PRIME). Signal representation-wise, PRIME employs leaky integrate-and-fire neurons to emulate the brain's inherent spiking mechanism. Drawing inspiration from the brain's structural plasticity, PRIME optimizes the topology of a random memristive spiking neural network without expensive memristor conductance fine-tuning. For runtime reconfigurability, inspired by the brain's dynamic adjustment of computational depth, PRIME employs an input-aware dynamic early-stop policy to minimize latency during inference, thereby boosting energy efficiency without compromising performance. Architecture-wise, PRIME leverages memristive in-memory computing, mirroring the brain and mitigating the von Neumann bottleneck. We validated our system using a 40 nm 256 Kb memristor-based in-memory computing macro on neuromorphic image classification and image inpainting. Our results demonstrate that the classification accuracy and Inception Score are comparable to the software baseline, while achieving up to 62.50-fold improvement in energy efficiency and up to 77.0% savings in computational load. The system also exhibits robustness against stochastic synaptic noise of analogue memristors. Our software-hardware co-designed model paves the way to future brain-inspired neuromorphic computing with brain-like energy efficiency and adaptivity.
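The input-aware early-stop policy can be illustrated with a small sketch: accumulate per-timestep evidence (e.g. output spike counts) and halt as soon as the decision is confident. The margin criterion and parameter values below are illustrative assumptions, not PRIME's actual policy.

```python
def dynamic_early_stop(step_logits, margin=2.0, max_steps=10):
    # Accumulate evidence over SNN timesteps and stop as soon as the
    # leading class beats the runner-up by `margin`. Easy inputs then
    # finish in fewer timesteps than hard ones, saving energy and latency.
    acc = [0.0] * len(step_logits[0])
    for t, logits in enumerate(step_logits[:max_steps], start=1):
        acc = [a + l for a, l in zip(acc, logits)]
        ranked = sorted(acc, reverse=True)
        if ranked[0] - ranked[1] >= margin:
            return acc.index(max(acc)), t  # prediction and steps used
    return acc.index(max(acc)), min(len(step_logits), max_steps)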
A Precision-Optimized Fixed-Point Near-Memory Digital Processing Unit for Analog In-Memory Computing
Ferro, Elena, Vasilopoulos, Athanasios, Lammie, Corey, Gallo, Manuel Le, Benini, Luca, Boybat, Irem, Sebastian, Abu
Analog In-Memory Computing (AIMC) is an emerging technology for fast and energy-efficient Deep Learning (DL) inference. However, a certain amount of digital post-processing is required to deal with circuit mismatches and non-idealities associated with the memory devices. Efficient near-memory digital logic is critical to retain the high area/energy efficiency and low latency of AIMC. Existing systems adopt Floating Point 16 (FP16) arithmetic with limited parallelization capability and high latency. To overcome these limitations, we propose a Near-Memory digital Processing Unit (NMPU) based on fixed-point arithmetic. It achieves competitive accuracy and higher computing throughput than previous approaches while minimizing the area overhead. Moreover, the NMPU supports standard DL activation steps, such as ReLU and Batch Normalization. We perform a physical implementation of the NMPU design in a 14 nm CMOS technology and provide detailed performance, power, and area assessments. We validate the efficacy of the NMPU by using data from an AIMC chip and demonstrate that a simulated AIMC system with the proposed NMPU outperforms existing FP16-based implementations, providing 139× speed-up, 7.8× smaller area, and a competitive power consumption. Additionally, our approach achieves an inference accuracy of 86.65%/65.06%, with an accuracy drop of just 0.12%/0.4% compared to the FP16 baseline when benchmarked with ResNet9/ResNet32 networks trained on the CIFAR10/CIFAR100 datasets, respectively.
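The core of fixed-point near-memory post-processing is rescaling a wide integer accumulator with an integer multiplier and a right shift instead of FP16 arithmetic. A minimal sketch (the multiplier/shift encoding below is a common fixed-point idiom, assumed here rather than taken from the NMPU design):

```python
def fixedpoint_requant(acc, multiplier, shift, relu=True):
    # Pure-integer post-processing of an analog tile's accumulator:
    # the effective scale is multiplier / 2**shift, applied with
    # round-to-nearest via the added half-LSB, then an optional ReLU.
    y = (acc * multiplier + (1 << (shift - 1))) >> shift
    if relu:
        y = max(0, y)
    return y
```

For example, `multiplier=205, shift=8` approximates a scale of 0.8 (205/256), so an accumulator of 100 requantizes to 80. A batch-norm step folds into the same multiplier/shift plus an integer bias, which is why this unit can replace FP16 logic at much lower area and latency.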
Pruning random resistive memory for optimizing analogue AI
Li, Yi, Wang, Songqi, Zhao, Yaping, Wang, Shaocong, Zhang, Woyu, He, Yangu, Lin, Ning, Cui, Binbin, Chen, Xi, Zhang, Shiming, Jiang, Hao, Lin, Peng, Zhang, Xumeng, Qi, Xiaojuan, Wang, Zhongrui, Xu, Xiaoxin, Shang, Dashan, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming
The rapid advancement of artificial intelligence (AI) has been marked by large language models exhibiting human-like intelligence. However, these models also present unprecedented challenges to energy consumption and environmental sustainability. One promising solution is to revisit analogue computing, a technique that predates digital computing and exploits emerging analogue electronic devices, such as resistive memory, which features in-memory computing, high scalability, and nonvolatility. However, analogue computing still faces the same challenges as before: programming nonidealities and expensive programming due to the underlying device physics. Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network. Software-wise, the topology of a randomly weighted neural network is optimized by pruning connections rather than precisely tuning resistive memory weights. Hardware-wise, we reveal the physical origin of the programming stochasticity using transmission electron microscopy, which is leveraged for large-scale and low-cost implementation of an overparameterized random neural network containing high-performance sub-networks. We implemented the co-design on a 40 nm 256K resistive memory macro, observing 17.3% and 19.9% accuracy improvements in image and audio classification on the FashionMNIST and Spoken Digits datasets, as well as 9.8% (2%) improvement in PR (ROC) in image segmentation on the DRIVE dataset, respectively. This is accompanied by 82.1%, 51.2%, and 99.8% improvements in energy efficiency thanks to analogue in-memory computing. By embracing the intrinsic stochasticity and in-memory computing, this work may solve the biggest obstacle of analogue computing systems and thus unleash their immense potential for next-generation AI hardware.
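The prune-instead-of-program idea can be sketched compactly: the resistive weights stay at whatever random conductances they landed on, and optimization touches only a binary edge mask. In this toy version (all names are illustrative; random scores stand in for the learned edge saliency the paper would train) a fixed fraction of edges survives.

```python
import random

def prune_random_layer(rng, d_in, d_out, keep_ratio=0.5):
    # The resistive weights are random and never reprogrammed; structural
    # plasticity acts only on a binary edge mask. Edges are ranked by a
    # score (a stand-in for a learned saliency) and the top fraction kept.
    W = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
    scores = [[rng.random() for _ in range(d_in)] for _ in range(d_out)]
    flat = sorted(s for row in scores for s in row)
    cutoff = flat[int(len(flat) * (1.0 - keep_ratio))]
    mask = [[1 if s >= cutoff else 0 for s in row] for row in scores]
    return W, mask

def masked_matvec(W, mask, x):
    # Inference uses only the surviving edges of the random crossbar.
    return [sum(w * m * xi for w, m, xi in zip(wr, mr, x))
            for wr, mr in zip(W, mask)]
```

Because pruning only disconnects cells rather than tuning conductances, the expensive and stochastic write operation is avoided entirely, which is the co-design's main hardware payoff.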
Enabling In-Memory Computing for Artificial Intelligence Part 1: The Analog Approach - Intel Communities
Hechen Wang is a research scientist for Intel Labs with interests in mixed-signal circuits, data converters, digital frequency synthesizers, wireless communication systems, and analog/mixed-signal compute-in-memory for AI applications. The fundamental building block of computer memory is the memory cell: an electronic circuit that stores binary information. In the conventional approach to data processing, the data resides on a hard disk in the system or attached over a network. When needed, it's called into the local system memory, or RAM, and then moved to the CPU. This lengthy process is inefficient, so researchers began to seek an alternative.
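The analog alternative this article introduces performs the computation where the data lives. The behavior of a resistive crossbar can be sketched as follows (a functional model only, with illustrative names, not a circuit description):

```python
def crossbar_mvm(G, v):
    # In-memory matrix-vector multiply on a resistive crossbar: each cell
    # passes current I = G * V (Ohm's law), and currents on a shared bit
    # line sum (Kirchhoff's current law), so reading the column currents
    # computes G^T v without ever moving the weights to a CPU.
    n_cols = len(G[0])
    return [sum(G[row][col] * v[row] for row in range(len(G)))
            for col in range(n_cols)]
```

The weights (conductances) never travel over the disk-to-RAM-to-CPU path described above; only the input voltages and output currents cross the array boundary.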
Interconnect Parasitics and Partitioning in Fully-Analog In-Memory Computing Architectures
Amin, Md Hasibul, Elbtity, Mohammed, Zand, Ramtin
Fully-analog in-memory computing (IMC) architectures that implement both matrix-vector multiplication and non-linear vector operations within the same memory array have shown promising performance benefits over conventional IMC systems due to the removal of energy-hungry signal conversion units. However, maintaining the computation in the analog domain for the entire deep neural network (DNN) comes with potential sensitivity to interconnect parasitics. Thus, in this paper, we investigate the effect of wire parasitic resistance and capacitance on the accuracy of DNN models deployed on fully-analog IMC architectures. Moreover, we propose a partitioning mechanism to alleviate the impact of the parasitics while keeping the computation in the analog domain, by dividing large arrays into multiple partitions. The SPICE circuit simulation results for a 400×120×84×10 DNN model deployed on a fully-analog IMC circuit show that a 94.84% accuracy could be achieved for the MNIST classification task with 16, 8, and 8 horizontal partitions, as well as 8, 8, and 1 vertical partitions for the first, second, and third layers of the DNN, respectively, which is comparable to the ~97% accuracy realized by a digital implementation on a CPU. It is shown that these accuracy benefits are achieved at the cost of higher power consumption, due to the extra circuitry required for handling partitioning.
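Functionally, partitioning splits one large array into tiles with shorter (less parasitic) wires and recombines their partial results. The sketch below shows the arithmetic of partitioning along the input dimension only (a simplification of the paper's horizontal/vertical scheme, with illustrative names):

```python
def partitioned_matvec(W, x, parts):
    # Split the input dimension into `parts` tiles. Each tile is a smaller
    # array whose shorter wires suffer less IR drop; the tiles' partial
    # output currents are summed to recover the full matrix-vector product.
    d_in = len(x)
    step = d_in // parts
    out = [0.0] * len(W)
    for p in range(parts):
        lo = p * step
        hi = (p + 1) * step if p < parts - 1 else d_in
        for i, row in enumerate(W):
            out[i] += sum(row[j] * x[j] for j in range(lo, hi))
    return out
```

Mathematically the result is identical to the unpartitioned product; on real hardware the gain is that each tile's parasitic error is smaller, at the cost of the extra summing circuitry the abstract notes.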
Samsung is working on artificial intelligence chips that use in-memory computing.
Samsung Electronics has announced the development of an in-memory computing system that combines memory and system semiconductors. For the first time, non-volatile memories, dubbed "magnetoresistive random access memory," are being used to enable the new technology, according to the world's largest memory chipmaker. In a traditional computer architecture, data is stored in memory chips and computed by separate processor chips.
Rounding Up Machine Learning Developments From 2020
The year 2020 saw many exciting developments in machine learning. As the year comes to an end, here is a roundup of these innovations across machine learning domains such as reinforcement learning, Natural Language Processing, ML frameworks such as PyTorch and TensorFlow, and more. Arm-based Graviton processors went mainstream in 2020; they utilize 30 billion transistors with 64-bit Arm cores and were built by Annapurna Labs, the Israeli engineering company acquired by AWS, to power memory-intensive workloads like real-time big data analytics. They showed a 40% performance improvement, emerging as an alternative to x86-based processors for machine learning and shifting the trend from the Intel-dominated cloud market toward Arm-based Graviton processors.
BANKING: MAKING AI IN CUSTOMER SERVICE A REALITY
Banks are constantly looking for opportunities to up- or cross-sell products to customers. Increasing product penetration from 2.5 products to 4 products per customer can add millions to the bottom line, and it is estimated to be 5 to 10 times cheaper to up- or cross-sell to an existing customer than to acquire a new one. Combining in-memory computing with AI opens up new opportunities to do so. When it comes to engaging customers in up- or cross-selling conversations, timing is everything. Customers are far more likely to be receptive to an approach when they are already interacting with the bank -- online, via the telephone, or in branch.