loihi
Efficient Synaptic Delay Implementation in Digital Event-Driven AI Accelerators
Meijer, Roy, Detterer, Paul, Yousefzadeh, Amirreza, Patino-Saucedo, Alberto, Tang, Guangzhi, Vadivel, Kanishkan, Xu, Yinfu, Gomony, Manil-Dev, Corradi, Federico, Linares-Barranco, Bernabe, Sifalakis, Manolis
Synaptic delay parameterization of neural network models has remained largely unexplored, but recent literature shows promising results, suggesting that delay-parameterized models are simpler, smaller, sparser, and thus more energy-efficient than non-delay-parameterized models of similar performance (e.g., task accuracy). We introduce the Shared Circular Delay Queue (SCDQ), a novel hardware structure for supporting synaptic delays on digital neuromorphic accelerators. Our analysis and hardware results show that it scales better in terms of memory than currently common approaches and is more amenable to algorithm-hardware co-optimization: its memory scaling is modulated by model sparsity and not merely by network size. In addition to memory, we report latency, area, and energy per inference.
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
Hardware-aware training of models with synaptic delays for digital event-driven neuromorphic processors
Patino-Saucedo, Alberto, Meijer, Roy, Yousefzadeh, Amirreza, Gomony, Manil-Dev, Corradi, Federico, Detterer, Paul, Garrido-Regife, Laura, Linares-Barranco, Bernabe, Sifalakis, Manolis
Configurable synaptic delays are a basic feature in many neuromorphic neural network hardware accelerators. However, they have rarely been used in model implementations, despite their promising impact on performance and efficiency in tasks that exhibit complex (temporal) dynamics, because it has been unclear how to optimize them. In this work, we propose a framework to train and deploy, in digital neuromorphic hardware, high-performing spiking neural network models (SNNs) in which the per-synapse delays are co-optimized together with the synaptic weights. Leveraging spike-based back-propagation-through-time, the training accounts for platform constraints, such as synaptic weight precision and the total number of parameters per core, as a function of the network size. In addition, a delay pruning technique is used to reduce the memory footprint at a low cost in performance. We evaluate trained models on two neuromorphic digital hardware platforms: Intel Loihi and Imec Seneca. Loihi offers synaptic delay support using the so-called Ring-Buffer hardware structure. Seneca does not provide native hardware support for synaptic delays. A second contribution of this paper is therefore a novel area- and memory-efficient hardware structure for acceleration of synaptic delays, which we have integrated in Seneca. The evaluated benchmark involves several models for solving the SHD (Spiking Heidelberg Digits) classification task, where minimal accuracy degradation during the transition from software to hardware is demonstrated. To our knowledge, this is the first work showcasing how to train and deploy hardware-aware models parameterized with synaptic delays on multicore neuromorphic hardware accelerators.
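The ring-buffer idea mentioned in the abstract above can be illustrated with a generic software sketch. This is not Loihi's actual hardware design, and it is not the structure integrated in Seneca; the class name `DelayRingBuffer` and its methods are hypothetical and chosen only for illustration. Each neuron gets one slot per possible delay value, and a spike with delay `d` is accumulated into the slot that will be read `d` timesteps later.

```python
import numpy as np

class DelayRingBuffer:
    """Illustrative per-neuron ring buffer that accumulates delayed synaptic input."""

    def __init__(self, num_neurons, max_delay):
        # One slot per delay value in [0, max_delay] (assumed integer timesteps).
        self.size = max_delay + 1
        self.buf = np.zeros((self.size, num_neurons))
        self.t = 0  # current timestep

    def add_spike(self, targets, weights, delays):
        # Schedule each weighted contribution `delay` steps into the future.
        slots = (self.t + np.asarray(delays)) % self.size
        # np.add.at accumulates correctly when the same (slot, target) repeats.
        np.add.at(self.buf, (slots, np.asarray(targets)), weights)

    def step(self):
        # Read the input due now, clear the slot for reuse, then advance time.
        slot = self.t % self.size
        current = self.buf[slot].copy()
        self.buf[slot] = 0.0
        self.t += 1
        return current


rb = DelayRingBuffer(num_neurons=3, max_delay=4)
rb.add_spike(targets=[0, 1, 1], weights=[1.0, 0.5, 0.25], delays=[0, 2, 2])
outs = [rb.step() for _ in range(4)]
```

In this sketch the buffer holds `num_neurons * (max_delay + 1)` values regardless of how many synapses actually carry a delay, so its memory grows with network size and delay range rather than with model sparsity.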
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- North America > United States (0.04)
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
- Asia > China (0.04)
Realtime Facial Expression Recognition: Neuromorphic Hardware vs. Edge AI Accelerators
Smith, Heath, Seekings, James, Mohammadi, Mohammadreza, Zand, Ramtin
The paper focuses on real-time facial expression recognition (FER) systems as an important component in various real-world applications such as social robotics. We investigate two hardware options for the deployment of FER machine learning (ML) models at the edge: neuromorphic hardware versus edge AI accelerators. Our study includes exhaustive experiments providing comparative analyses between the Intel Loihi neuromorphic processor and four distinct edge platforms: Raspberry Pi-4, Intel Neural Compute Stick (NCS), Jetson Nano, and Coral TPU. The results show that Loihi can achieve approximately two orders of magnitude reduction in power dissipation and one order of magnitude energy savings compared to Coral TPU, which is the least power-intensive and energy-consuming of the edge AI accelerators. These reductions in power and energy are achieved while the neuromorphic solution maintains a level of accuracy comparable to the edge accelerators, all within the real-time latency requirements.
- North America > United States > South Carolina > Richland County > Columbia (0.14)
- North America > United States > New York > New York County > New York City (0.04)
Evaluating Spiking Neural Network On Neuromorphic Platform For Human Activity Recognition
Energy efficiency and low latency are crucial requirements for designing wearable AI-empowered human activity recognition systems, due to the hard constraints of battery operation and closed-loop feedback. While neural network models have been extensively compressed to match the stringent edge requirements, spiking neural networks and event-based sensing have recently emerged as promising solutions to further improve performance, due to their inherent energy efficiency and capacity to process spatiotemporal data at very low latency. This work aims to evaluate the effectiveness of spiking neural networks on neuromorphic processors in human activity recognition for wearable applications. The case of workout recognition with wrist-worn wearable motion sensors is used as a study. A multi-threshold delta modulation approach is used to encode the input sensor data into spike trains, moving the pipeline to an event-based approach. The spike trains are then fed to a spiking neural network with direct-event training, and the trained model is deployed on Loihi, the research neuromorphic platform from Intel, to evaluate energy and latency efficiency. Test results show that the spike-based workout recognition system can achieve an accuracy (87.5%) comparable to that of a traditional neural network (88.1%) on the popular milliwatt RISC-V-based multi-core processor GAP8, while achieving a two times better energy-delay product (0.66 µJ·s vs. 1.32 µJ·s).
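The abstract above does not spell out the encoder, so the following is only a minimal sketch of one plausible reading of multi-threshold delta modulation, not the authors' implementation. The function name `multi_threshold_delta_encode`, the UP/DOWN channel layout, and the rule of resetting the reference after any spike are all assumptions made for illustration.

```python
import numpy as np

def multi_threshold_delta_encode(signal, thresholds):
    """Encode a 1-D signal into UP/DOWN spike channels, one pair per threshold.

    Returns an int8 array of shape (len(signal), 2 * K) for K thresholds:
    columns [0, K) are UP channels, columns [K, 2K) are DOWN channels.
    """
    signal = np.asarray(signal, dtype=float)
    thresholds = np.sort(np.asarray(thresholds, dtype=float))
    k = len(thresholds)
    spikes = np.zeros((len(signal), 2 * k), dtype=np.int8)
    reference = signal[0]  # signal value at the last emitted spike (assumed rule)
    for t in range(1, len(signal)):
        delta = signal[t] - reference
        crossed = np.abs(delta) >= thresholds  # which threshold levels were reached
        if crossed.any():
            offset = 0 if delta > 0 else k  # UP block or DOWN block
            spikes[t, offset:offset + k] = crossed
            reference = signal[t]  # reset the reference after any spike
    return spikes


spikes = multi_threshold_delta_encode([0.0, 0.05, 0.3, 0.3, -0.5], thresholds=[0.1, 0.5])
```

With these hypothetical inputs, the small rise at index 2 triggers only the lowest UP channel, while the large drop at index 4 triggers both DOWN channels, so larger changes produce more spikes and a steady signal produces none.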
- Europe > Switzerland > Zürich > Zürich (0.15)
- North America > Mexico > Quintana Roo > Cancún (0.05)
- North America > United States > New York > New York County > New York City (0.04)
Energy-Efficient Deployment of Machine Learning Workloads on Neuromorphic Hardware
Chandarana, Peyton, Mohammadi, Mohammadreza, Seekings, James, Zand, Ramtin
As the technology industry moves towards implementing tasks such as natural language processing, path planning, image classification, and more on smaller edge computing devices, the demand for more efficient implementations of algorithms and hardware accelerators has become a significant area of research. In recent years, several edge deep learning hardware accelerators have been released that specifically focus on reducing the power and area consumed by deep neural networks (DNNs). On the other hand, spiking neural networks (SNNs), which operate on discrete time-series data, have been shown to achieve substantial power reductions over even the aforementioned edge DNN accelerators when deployed on specialized neuromorphic event-based/asynchronous hardware. While neuromorphic hardware has demonstrated great potential for accelerating deep learning tasks at the edge, the current space of algorithms and hardware is limited and still in rather early development. Thus, many hybrid approaches have been proposed which aim to convert pre-trained DNNs into SNNs. In this work, we provide a general guide to converting pre-trained DNNs into SNNs while also presenting techniques to improve the deployment of converted SNNs on neuromorphic hardware with respect to latency, power, and energy. Our experimental results show that, compared against the Intel Neural Compute Stick 2, Intel's neuromorphic processor, Loihi, consumes up to 27x less power and 5x less energy in the tested image classification tasks when using our SNN improvement techniques.
- North America > United States > South Carolina > Richland County > Columbia (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Information Technology (0.88)
- Energy > Power Industry (0.46)
Is Intel Labs' brain-inspired AI approach the future of robot learning?
Can computer systems develop to the point where they can think creatively, identify people or items they have never seen before, and adjust accordingly -- all while working more efficiently, with less power? Intel Labs is betting on it, with a new hardware and software approach using neuromorphic computing, which, according to a recent blog post, "uses new algorithmic approaches that emulate how the human brain interacts with the world to deliver capabilities closer to human cognition." While this may sound futuristic, Intel's neuromorphic computing research is already fostering interesting use cases, including how to add new voice interaction commands to Mercedes-Benz vehicles; create a robotic hand that delivers medications to patients; or develop chips that recognize hazardous chemicals. Machine learning-driven systems, such as autonomous cars, robotics, drones, and other self-sufficient technologies, have relied on ever-smaller, more-powerful, energy-efficient processing chips.
- Information Technology (0.92)
- Education > Educational Setting (0.31)
Braille Letter Reading: A Benchmark for Spatio-Temporal Pattern Recognition on Neuromorphic Hardware
Muller-Cleve, Simon F, Fra, Vittorio, Khacef, Lyes, Pequeno-Zurro, Alejandro, Klepatsch, Daniel, Forno, Evelina, Ivanovich, Diego G, Rastogi, Shavika, Urgese, Gianvito, Zenke, Friedemann, Bartolozzi, Chiara
Spatio-temporal pattern recognition is a fundamental ability of the brain which is required for numerous real-world activities. Recent deep learning approaches have reached outstanding accuracies in such tasks, but their implementation on conventional embedded solutions is still very computationally and energy expensive. Tactile sensing in robotic applications is a representative example where real-time processing and energy efficiency are required. Following a brain-inspired computing approach, we propose a new benchmark for spatio-temporal tactile pattern recognition at the edge through Braille letter reading. We recorded a new Braille letters dataset based on the capacitive tactile sensors of the iCub robot's fingertip. We then investigated the importance of spatial and temporal information as well as the impact of event-based encoding on spike-based computation. Afterward, we trained and compared feedforward and recurrent Spiking Neural Networks (SNNs) offline using Backpropagation Through Time (BPTT) with surrogate gradients, then deployed them on the Intel Loihi neuromorphic chip for fast and efficient inference. We compared our approach to standard classifiers, in particular to a Long Short-Term Memory (LSTM) network deployed on the embedded NVIDIA Jetson GPU, in terms of classification accuracy, power, energy consumption, and delay. Our results show that the LSTM reaches ~97% accuracy, outperforming the recurrent SNN by ~17% when using continuous frame-based data instead of event-based inputs. However, the recurrent SNN on Loihi with event-based inputs is ~500 times more energy-efficient than the LSTM on Jetson, requiring a total power of only ~30 mW. This work proposes a new benchmark for tactile sensing and highlights the challenges and opportunities of event-based encoding, neuromorphic hardware, and spike-based computing for spatio-temporal pattern recognition at the edge.
- Europe > Austria (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
Static Hand Gesture Recognition for American Sign Language using Neuromorphic Hardware
Mohammadi, MohammadReza, Chandarana, Peyton, Seekings, James, Hendrix, Sara, Zand, Ramtin
In this paper, we develop four spiking neural network (SNN) models for two static American Sign Language (ASL) hand gesture classification tasks, i.e., the ASL Alphabet and ASL Digits. The SNN models are deployed on Intel's neuromorphic platform, Loihi, and then compared against equivalent deep neural network (DNN) models deployed on an edge computing device, the Intel Neural Compute Stick 2 (NCS2). We perform a comprehensive comparison between the two systems in terms of accuracy, latency, power consumption, and energy. The best DNN model achieves an accuracy of 99.93% on the ASL Alphabet dataset, whereas the best-performing SNN model has an accuracy of 99.30%. For the ASL Digits dataset, the best DNN model achieves an accuracy of 99.76%, while the SNN achieves 99.03%. Moreover, our experimental results show that the Loihi neuromorphic hardware implementations achieve up to 20.64x and 4.10x reductions in power consumption and energy, respectively, when compared to NCS2.
- North America > United States > South Carolina > Richland County > Columbia (0.14)
- North America > Cuba (0.04)
- North America > United States > Georgia > Chatham County > Savannah (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Education > Curriculum > Subject-Specific Education (0.64)
- Energy (0.46)