Neftci, Emre
A Grid Cell-Inspired Structured Vector Algebra for Cognitive Maps
Krausse, Sven, Neftci, Emre, Sommer, Friedrich T., Renner, Alpha
The entorhinal-hippocampal formation is the mammalian brain's navigation system, encoding both physical and abstract spaces via grid cells. This system is well studied in neuroscience, and its efficiency and versatility make it attractive for applications in robotics and machine learning. While continuous attractor networks (CANs) successfully model entorhinal grid cells for encoding physical space, integrating computations over both continuous physical space and abstract spaces into a unified framework remains challenging. Here, we attempt to bridge this gap by proposing a mechanistic model for versatile information processing in the entorhinal-hippocampal formation inspired by CANs and Vector Symbolic Architectures (VSAs), a neuro-symbolic computing framework. The novel grid-cell VSA (GC-VSA) model employs a spatially structured encoding scheme with 3D neuronal modules that mimic the discrete scales and orientations of grid cell modules and reproduce their characteristic hexagonal receptive fields. In experiments, the model demonstrates versatility in spatial and abstract tasks: (1) accurate path integration for tracking locations, (2) spatio-temporal representation for querying object locations and temporal relations, and (3) symbolic reasoning using family trees as a structured test case for hierarchical relationships.
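For a concrete feel for the encoding idea, the following NumPy sketch builds one toy "grid module" from three plane waves, showing a hexagonal similarity field and binding-based path integration. It is a minimal illustration of the general phasor/VSA ingredients the abstract mentions, not the GC-VSA model itself; the dimension, wave vectors, and function names are invented for this example.

```python
import numpy as np

# One toy grid module: three plane waves whose wave vectors are 60 degrees apart.
# Their superposition produces the characteristic hexagonal firing pattern.
angles = np.deg2rad([0.0, 60.0, 120.0])
period = 2.0                                   # grid spacing (arbitrary units)
k = (4 * np.pi / (np.sqrt(3) * period)) * np.stack(
    [np.cos(angles), np.sin(angles)], axis=1)  # (3, 2) wave vectors

def phasor_encode(pos, phase_offsets):
    """Phasor (complex) population vector for a 2D position, one grid module."""
    return np.exp(1j * (k @ np.asarray(pos) + phase_offsets))

def similarity(a, b):
    return np.real(np.vdot(a, b)) / len(a)

rng = np.random.default_rng(0)
offsets = rng.uniform(-np.pi, np.pi, size=3)   # module phase (spatial offset)

# Similarity of one reference encoding to a grid of probe positions is the
# receptive field of a single "grid cell": hexagonally arranged bumps.
ref = phasor_encode([0.0, 0.0], offsets)
xs = np.linspace(-5, 5, 101)
field = np.array([[similarity(ref, phasor_encode([x, y], offsets)) for x in xs]
                  for y in xs])
print(field.shape, field.round(2).max())

# Path integration = VSA binding: multiplying by the phasor of a displacement
# shifts all phases, i.e. moves the encoded position.
moved = ref * phasor_encode([0.3, -0.2], np.zeros(3))
print(np.allclose(moved, phasor_encode([0.3, -0.2], offsets)))  # True
```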
A Truly Sparse and General Implementation of Gradient-Based Synaptic Plasticity
Lohoff, Jamie, Kaya, Anil, Assmuth, Florian, Neftci, Emre
Online synaptic plasticity rules derived from gradient descent achieve high accuracy on a wide range of practical tasks. However, their software implementation often requires tediously hand-derived gradients or gradient backpropagation, which sacrifices the online capability of the rules. In this work, we present a custom automatic differentiation (AD) pipeline for sparse and online implementation of gradient-based synaptic plasticity rules that generalizes to arbitrary neuron models. Our work brings the programming ease of backpropagation-type methods to forward AD while remaining memory-efficient. To achieve this, we exploit the advantageous compute and memory scaling of online synaptic plasticity by providing an inherently sparse implementation of AD in which expensive tensor contractions are replaced with simple element-wise multiplications when the tensors are diagonal. Gradient-based synaptic plasticity rules such as eligibility propagation (e-prop) have exactly this property and thus profit immensely from this feature. We demonstrate that our gradients align with those of gradient backpropagation on a synthetic task where e-prop gradients are exact, as well as on audio speech classification benchmarks. We also show that memory utilization scales with network size independently of the sequence length, as expected from forward AD methods.
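As a minimal illustration of why diagonal state Jacobians make gradient-based plasticity cheap to compute online, the scalar NumPy sketch below accumulates an eligibility-trace-style forward gradient for a leaky integrator and checks it against a numerical reference. It is not the paper's AD pipeline; the toy model and variable names are invented for this example.

```python
import numpy as np

# Toy leaky integrator: v[t] = alpha * v[t-1] + w * x[t], loss = 0.5 * sum_t (v[t] - y[t])^2.
# Because dv[t]/dv[t-1] = alpha is (trivially) diagonal, forward-mode AD reduces to an
# element-wise eligibility trace e[t] = alpha * e[t-1] + x[t], updated online.
rng = np.random.default_rng(1)
T, alpha, w = 50, 0.9, 0.3
x = rng.normal(size=T)
y = rng.normal(size=T)

# Online (forward-mode) gradient via eligibility trace.
v, e, grad_online = 0.0, 0.0, 0.0
for t in range(T):
    v = alpha * v + w * x[t]
    e = alpha * e + x[t]           # dv[t]/dw, carried forward cheaply
    grad_online += (v - y[t]) * e  # accumulate dL/dw as the sequence streams in

# Reference: numerical gradient of the full loss.
def loss(w_):
    v_, L = 0.0, 0.0
    for t in range(T):
        v_ = alpha * v_ + w_ * x[t]
        L += 0.5 * (v_ - y[t]) ** 2
    return L

eps = 1e-6
grad_fd = (loss(w + eps) - loss(w - eps)) / (2 * eps)
print(grad_online, grad_fd)  # the two agree up to numerical error
```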
Optimal Gradient Checkpointing for Sparse and Recurrent Architectures using Off-Chip Memory
Bencheikh, Wadjih, Finkbeiner, Jan, Neftci, Emre
Recurrent neural networks (RNNs) are valued for their computational efficiency and reduced memory requirements on tasks involving long sequence lengths, but they require high memory-processor bandwidth to train. Checkpointing techniques can reduce the memory requirements by storing only a subset of intermediate states, the checkpoints, but are still rarely used due to the computational overhead of the additional recomputation phase. This work addresses these challenges by introducing memory-efficient gradient checkpointing strategies tailored for the general class of sparse RNNs and Spiking Neural Networks (SNNs). SNNs are energy-efficient alternatives to RNNs thanks to their local, event-driven operation and potential neuromorphic implementation. We use the Intelligence Processing Unit (IPU) as an exemplary platform for architectures with distributed local memory and exploit its suitability for sparse and irregular workloads to scale SNN training to long sequence lengths. We find that Double Checkpointing emerges as the most effective method, optimizing the use of local memory resources while minimizing recomputation overhead. This approach reduces dependency on slower large-scale memory access, enabling training on sequences over 10 times longer, or networks 4 times larger, than previously feasible, with only marginal time overhead. The presented techniques demonstrate significant potential to enhance scalability and efficiency in training sparse and recurrent networks across diverse hardware platforms, and highlight the benefits of sparse activations for scalable recurrent neural network training.
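The sketch below illustrates the basic recompute-versus-memory trade-off behind such strategies with single-level gradient checkpointing over time on a toy recurrent model; the paper's Double Checkpointing adds a second checkpoint level between local and off-chip memory, which is not reproduced here. All sizes and names are illustrative.

```python
import numpy as np

# Toy recurrent model: v[t] = alpha * v[t-1] + w * x[t], loss = 0.5 * sum_t (v[t] - y[t])^2.
rng = np.random.default_rng(2)
T, K = 64, 8            # sequence length and checkpoint interval (hypothetical values)
alpha, w = 0.9, 0.3
x, y = rng.normal(size=T), rng.normal(size=T)

# Forward pass: keep only every K-th state ("checkpoints" that would live in larger, slower memory).
checkpoints, v = {}, 0.0
for t in range(T):
    if t % K == 0:
        checkpoints[t] = v          # state *before* step t
    v = alpha * v + w * x[t]

# Backward pass: walk the segments in reverse, recompute each segment's states
# from its checkpoint into a small local buffer, then backpropagate through it.
grad_w, lam = 0.0, 0.0              # lam carries dL/dv across segments
for start in range(((T - 1) // K) * K, -1, -K):
    end = min(start + K, T)
    seg_v, v = [], checkpoints[start]
    for t in range(start, end):     # recomputation phase (extra compute, little memory)
        v = alpha * v + w * x[t]
        seg_v.append(v)
    for t in range(end - 1, start - 1, -1):
        lam = (seg_v[t - start] - y[t]) + alpha * lam
        grad_w += lam * x[t]

# Reference: plain BPTT storing all T states.
vs, v = [], 0.0
for t in range(T):
    v = alpha * v + w * x[t]
    vs.append(v)
grad_ref, lam = 0.0, 0.0
for t in range(T - 1, -1, -1):
    lam = (vs[t] - y[t]) + alpha * lam
    grad_ref += lam * x[t]
print(grad_w, grad_ref)             # identical up to floating-point error
```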
Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models
Leroux, Nathan, Manea, Paul-Philipp, Sudarshan, Chirag, Finkbeiner, Jan, Siegel, Sebastian, Strachan, John Paul, Neftci, Emre
Transformer networks, driven by self-attention, are central to Large Language Models. In generative Transformers, self-attention uses cache memory to store token projections, avoiding recomputation at each time step. However, GPU-stored projections must be loaded into SRAM for each new generation step, causing latency and energy bottlenecks. We present a custom self-attention in-memory computing architecture based on emerging charge-based memories called gain cells, which can be efficiently written to store new tokens during sequence generation and enable the parallel analog dot-product computation required for self-attention. However, the analog gain cell circuits introduce non-idealities and constraints that prevent the direct mapping of pre-trained models. To circumvent this problem, we design an initialization algorithm that achieves text processing performance comparable to GPT-2 without training from scratch. Our architecture reduces attention latency and energy consumption by up to two and five orders of magnitude, respectively, compared to GPUs, marking a significant step toward ultra-fast, low-power generative Transformers.
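To indicate where such non-idealities enter the computation, the hedged sketch below runs a single-query attention step through a stand-in saturating transfer curve applied to the cached projections. The actual gain-cell nonlinearities, normalization, and circuit constraints in the paper differ; all names and values here are assumptions made for illustration.

```python
import numpy as np

def saturating(x, v_max=3.0):
    """Stand-in for an analog readout non-ideality: a saturating transfer curve."""
    return v_max * np.tanh(x / v_max)

def analog_style_attention(q, K_cache, V_cache):
    """Single-head, single-query attention with saturating analog-style dot products.
    q: (d,); K_cache, V_cache: (t, d) projections written once per generated token."""
    scores = saturating(K_cache @ q / np.sqrt(K_cache.shape[1]))
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return saturating(V_cache.T @ weights)

rng = np.random.default_rng(3)
d, t = 16, 10
K_cache = rng.normal(size=(t, d))   # one row appended per generation step
V_cache = rng.normal(size=(t, d))
q = rng.normal(size=d)
print(analog_style_attention(q, K_cache, V_cache).shape)   # (16,)
```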
Zero-Shot Temporal Resolution Domain Adaptation for Spiking Neural Networks
Karilanova, Sanja, Fabre, Maxime, Neftci, Emre, Özçelikkale, Ayça
Spiking Neural Networks (SNNs) are biologically inspired deep neural networks that efficiently extract temporal information and offer promising gains in energy efficiency and latency when deployed on neuromorphic devices. However, SNN model parameters are sensitive to temporal resolution, leading to significant performance drops when the temporal resolution of the target data at the edge differs from that of the pre-deployment source data used for training, especially when fine-tuning is not possible at the edge. To address this challenge, we propose three novel domain adaptation methods for adapting neuron parameters to account for the change in temporal resolution without re-training on the target resolution. The proposed methods are based on a mapping between neuron dynamics in SNNs and State Space Models (SSMs) and are applicable to general neuron models. We evaluate the proposed methods on spatio-temporal tasks, namely the audio keyword spotting datasets SHD and MSWC as well as the NMNIST image classification dataset. Our methods provide an alternative to, and in the majority of cases significantly outperform, the existing reference method that simply scales the time constant. Moreover, our results show that high accuracy on high-temporal-resolution data can be obtained by time-efficient training on lower-temporal-resolution data followed by model adaptation.
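As a hedged sketch of the general SSM-style re-discretization idea (not the paper's specific methods), the snippet below recovers a leaky-integrate neuron's time constant from its trained decay at the source resolution and re-discretizes both the decay and the input scaling at the target resolution. Parameter values and the function name are illustrative.

```python
import numpy as np

def adapt_lif_params(alpha_src, dt_src, dt_tgt):
    """Re-discretize a leaky-integrate neuron for a new time step (illustrative sketch).

    View the neuron as the continuous system dv/dt = (-v + I) / tau, discretized with a
    zero-order hold: v[t+1] = a * v[t] + b * I[t], with a = exp(-dt/tau) and b = 1 - a.
    Given the trained decay `alpha_src` at resolution `dt_src`, recover tau and
    re-discretize at `dt_tgt`; the methods proposed in the paper differ in detail."""
    tau = -dt_src / np.log(alpha_src)     # invert a = exp(-dt/tau)
    alpha_tgt = np.exp(-dt_tgt / tau)
    beta_tgt = 1.0 - alpha_tgt            # matched input scaling at the new resolution
    return alpha_tgt, beta_tgt

# Example: a model trained at 1 ms resolution deployed on 4 ms data.
alpha_tgt, beta_tgt = adapt_lif_params(alpha_src=0.95, dt_src=1e-3, dt_tgt=4e-3)
print(alpha_tgt, beta_tgt)   # weaker per-step decay, same underlying time constant
```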
Optimizing Automatic Differentiation with Deep Reinforcement Learning
Lohoff, Jamie, Neftci, Emre
Computing Jacobians with automatic differentiation is ubiquitous in many scientific domains such as machine learning, computational fluid dynamics, robotics, and finance. Even small savings in the number of computations or in memory usage for Jacobian computations can yield massive savings in energy consumption and runtime. While many methods exist that allow for such savings, they generally trade exactness for computational efficiency and only approximate the Jacobian. In this paper, we present a novel method to optimize the number of multiplications necessary for Jacobian computation by leveraging deep reinforcement learning (RL) and a concept called cross-country elimination, while still computing the exact Jacobian. Cross-country elimination is a framework for automatic differentiation that phrases Jacobian accumulation as the ordered elimination of all vertices on the computational graph, where every elimination incurs a certain computational cost. We formulate the search for the optimal elimination order that minimizes the number of necessary multiplications as a single-player game played by an RL agent. We demonstrate that this method achieves up to 33% improvements over state-of-the-art methods on several relevant tasks taken from diverse domains. Furthermore, we show that these theoretical gains translate into actual runtime improvements by providing a cross-country elimination interpreter in JAX that can efficiently execute the obtained elimination orders.
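The toy example below makes the cost model concrete: it performs vertex (cross-country) elimination on a small linearized computational graph, counts multiplications, and brute-forces all elimination orders to show that the order matters. The paper replaces this brute-force search with an RL agent on much larger graphs; the graph and function names here are invented for illustration.

```python
import itertools

def eliminate(edges, order):
    """Count multiplications needed to accumulate the Jacobian by vertex elimination.

    `edges` is a set of (i, j) pairs for edges i -> j in the linearized computational
    graph; scalar partials are assumed, so each fill-in update costs one multiplication."""
    edges = set(edges)
    mults = 0
    for v in order:                       # eliminate intermediate vertices in this order
        preds = [i for (i, j) in edges if j == v]
        succs = [k for (j, k) in edges if j == v]
        for i, k in itertools.product(preds, succs):
            mults += 1                    # d_ik += d_iv * d_vk
            edges.add((i, k))
        edges = {(i, j) for (i, j) in edges if v not in (i, j)}
    return mults

# Small example graph: inputs 0, 1 -> intermediates 2, 3, 4 -> output 5.
E = {(0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)}
intermediates = [2, 3, 4]
costs = {order: eliminate(E, order)
         for order in itertools.permutations(intermediates)}
best = min(costs, key=costs.get)
print("forward order:", costs[(2, 3, 4)], "reverse order:", costs[(4, 3, 2)],
      "best order:", best, costs[best])
```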
Distributed Representations Enable Robust Multi-Timescale Computation in Neuromorphic Hardware
Cotteret, Madison, Greatorex, Hugh, Renner, Alpha, Chen, Junren, Neftci, Emre, Wu, Huaqiang, Indiveri, Giacomo, Ziegler, Martin, Chicca, Elisabetta
Programming recurrent spiking neural networks (RSNNs) to robustly perform multi-timescale computation remains a difficult challenge. To address this, we show how the distributed approach offered by vector symbolic architectures (VSAs), which uses high-dimensional random vectors as the smallest units of representation, can be leveraged to embed robust multi-timescale dynamics into attractor-based RSNNs. We embed finite state machines into the RSNN dynamics by superimposing a symmetric autoassociative weight matrix and asymmetric transition terms. The transition terms are formed by VSA-binding an input with heteroassociative outer products between states. Our approach is validated through simulations with highly non-ideal weights; an experimental closed-loop memristive hardware setup; and on Loihi 2, where it scales seamlessly to large state machines. This work demonstrates the effectiveness of VSA representations for embedding robust computation with recurrent dynamics into neuromorphic hardware, without requiring parameter fine-tuning or significant platform-specific optimisation. This advances VSAs as a high-level representation-invariant abstract language for cognitive algorithms in neuromorphic hardware.
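The NumPy sketch below shows the weight-construction idea in a rate-based, non-spiking toy: a symmetric autoassociative term stabilizes each state, and an asymmetric term formed by binding an input with a heteroassociative outer product drives a single transition. The dimension, gain factor, and update rule are simplifications assumed for this example, not the paper's spiking implementation.

```python
import numpy as np

rng = np.random.default_rng(4)
D, lam = 1024, 2.0                       # vector dimension and transition gain (assumed)
def rand_vec():
    return rng.choice([-1.0, 1.0], size=D)

# Two FSM states and one input symbol, all encoded as random bipolar vectors.
s = {"A": rand_vec(), "B": rand_vec()}
x_go = rand_vec()

# Symmetric autoassociative term makes each state vector an attractor; the asymmetric
# term binds the input to the source state and maps the result onto the target state.
W_auto = (s["A"][:, None] * s["A"][None, :] + s["B"][:, None] * s["B"][None, :]) / D
W_trans = lam * (s["B"][:, None] * (s["A"] * x_go)[None, :]) / D   # A --x_go--> B

def step(z, u):
    """One discrete-time update: attractor recall plus input-gated transition."""
    return np.sign(W_auto @ z + W_trans @ (z * u))

z = s["A"].copy()
no_input = np.ones(D)                    # an un-bound input leaves the transition term ~0
for _ in range(3):
    z = step(z, no_input)
print("stays in A:", np.mean(z == s["A"]))     # ~1.0
z = step(z, x_go)                        # presenting the bound input triggers A -> B
print("moved to B:", np.mean(z == s["B"]))     # ~1.0
```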
A Hybrid SNN-ANN Network for Event-based Object Detection with Spatial and Temporal Attention
Ahmed, Soikat Hasan, Finkbeiner, Jan, Neftci, Emre
Event cameras offer high temporal resolution and dynamic range with minimal motion blur, making them promising for object detection tasks. While Spiking Neural Networks (SNNs) are a natural match for event-based sensory data and enable ultra-energy-efficient, low-latency inference on neuromorphic hardware, Artificial Neural Networks (ANNs) tend to display more stable training dynamics and faster convergence, resulting in greater task performance. Hybrid SNN-ANN approaches are a promising alternative that makes it possible to leverage the strengths of both SNN and ANN architectures. In this work, we introduce the first Hybrid Attention-based SNN-ANN backbone for object detection using event cameras. We propose a novel attention-based SNN-ANN bridge module that captures sparse spatial and temporal relations from the SNN layer and converts them into dense feature maps for the ANN part of the backbone. Experimental results demonstrate that our proposed method surpasses baseline hybrid and SNN-based approaches by significant margins, with results comparable to existing ANN-based methods. Extensive ablation studies confirm the effectiveness of our proposed modules and architectural choices. These results pave the way toward a hybrid SNN-ANN architecture that achieves ANN-like performance at a drastically reduced parameter budget. We implemented the SNN blocks on digital neuromorphic hardware to investigate latency and power consumption and to demonstrate the feasibility of our approach.
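The following hedged sketch shows one generic way an attention-style bridge could turn a sparse spike tensor into a dense feature map (temporal attention over time steps, a spatial gate over locations). It is not the paper's module; all shapes, weights, and names are invented for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bridge(spikes, w_t, w_s):
    """Toy attention bridge: sparse spikes (T, C, H, W) -> dense feature map (C, H, W).

    Temporal attention weights each time step, a spatial gate weights each location;
    both are simple scalar-parameterized projections here for illustration."""
    T, C, H, W = spikes.shape
    t_feat = spikes.reshape(T, -1).sum(axis=1)          # per-step spike counts
    a_t = softmax(w_t * t_feat)                         # (T,) temporal attention
    dense = np.tensordot(a_t, spikes, axes=(0, 0))      # weighted sum over time -> (C, H, W)
    s_feat = dense.mean(axis=0)                         # channel-pooled activity (H, W)
    a_s = 1.0 / (1.0 + np.exp(-w_s * s_feat))           # spatial attention gate
    return dense * a_s[None, :, :]

rng = np.random.default_rng(5)
spikes = (rng.random((8, 4, 16, 16)) < 0.05).astype(np.float32)   # sparse binary events
print(bridge(spikes, w_t=0.1, w_s=2.0).shape)           # (4, 16, 16)
```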
Harnessing Manycore Processors with Distributed Memory for Accelerated Training of Sparse and Recurrent Models
Finkbeiner, Jan, Gmeinder, Thomas, Pupilli, Mark, Titterton, Alexander, Neftci, Emre
Current AI training infrastructure is dominated by single instruction multiple data (SIMD) and systolic array architectures, such as Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs), that excel at accelerating parallel workloads and dense vector-matrix multiplications. Potentially more efficient neural network models utilizing sparsity and recurrence cannot leverage the full power of SIMD processors and are thus at a severe disadvantage compared to today's prominent parallel models like Transformers and CNNs, hindering the path towards more sustainable AI. To overcome this limitation, we explore sparse and recurrent model training on a massively parallel multiple instruction multiple data (MIMD) architecture with distributed local memory. We implement a training routine based on backpropagation through time (BPTT) for the brain-inspired class of Spiking Neural Networks (SNNs) that feature binary sparse activations. We observe a massive advantage in using sparse activation tensors with a MIMD processor, the Intelligence Processing Unit (IPU), compared to GPUs. On training workloads, our results demonstrate 5-10x throughput gains compared to A100 GPUs and up to 38x gains for higher levels of activation sparsity, without a significant slowdown in training convergence or reduction in final model performance. Furthermore, our results show highly promising trends for both single- and multi-IPU configurations as we scale up to larger model sizes. Our work paves the way towards more efficient, non-standard models via AI training hardware beyond GPUs, and towards competitive large-scale SNN models.
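The snippet below illustrates the arithmetic reason sparse binary activations help: with binary spikes, a dense matrix-vector product collapses to gathering and summing the weight columns of active neurons. This is only the back-of-the-envelope intuition, not the IPU implementation; sizes and sparsity level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n_pre, n_post, p_active = 1024, 1024, 0.02     # illustrative sizes and spike sparsity
W = rng.normal(size=(n_post, n_pre))
spikes = (rng.random(n_pre) < p_active).astype(np.float64)

# Dense formulation: a full matrix-vector product, n_pre * n_post multiply-accumulates.
dense_out = W @ spikes

# Sparse, event-driven formulation: because spikes are binary, the product reduces to
# gathering and summing the weight columns of active neurons (no multiplications at all).
active = np.flatnonzero(spikes)
sparse_out = W[:, active].sum(axis=1)

print(np.allclose(dense_out, sparse_out),
      f"dense MACs: {n_pre * n_post}, column adds: {len(active) * n_post}")
```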
Design Principles for Lifelong Learning AI Accelerators
Kudithipudi, Dhireesha, Daram, Anurag, Zyarah, Abdullah M., Zohora, Fatima Tuz, Aimone, James B., Yanguas-Gil, Angel, Soures, Nicholas, Neftci, Emre, Mattina, Matthew, Lomonaco, Vincenzo, Thiem, Clare D., Epstein, Benjamin
Lifelong learning - an agent's ability to learn throughout its lifetime - is a hallmark of biological learning systems and a central challenge for artificial intelligence (AI). The development of lifelong learning algorithms could lead to a range of novel AI applications, but this will also require the development of appropriate hardware accelerators, particularly if the models are to be deployed on edge platforms, which have strict size, weight, and power constraints. Here, we explore the design of lifelong learning AI accelerators that are intended for deployment in untethered environments. We identify key desirable capabilities for lifelong learning accelerators and highlight metrics to evaluate such accelerators. We then discuss current edge AI accelerators and explore the future design of lifelong learning accelerators, considering the role that different emerging technologies could play.