AITopics | egru

Weight Sparsity Complements Activity Sparsity in Neuromorphic Language Models

Mukherji, Rishav, Schöne, Mark, Nazeer, Khaleelulla Khan, Mayr, Christian, Kappel, David, Subramoney, Anand

arXiv.org Artificial Intelligence, May-1-2024

Activity and parameter sparsity are two standard methods of making neural networks computationally more efficient. Event-based architectures such as spiking neural networks (SNNs) naturally exhibit activity sparsity, and many methods exist to sparsify their connectivity by pruning weights. While the effect of weight pruning on feed-forward SNNs has been previously studied for computer vision tasks, the effects of pruning for complex sequence tasks like language modeling are less well studied since SNNs have traditionally struggled to achieve meaningful performance on these tasks. Using a recently published SNN-like architecture that works well on small-scale language modeling, we study the effects of weight pruning when combined with activity sparsity. Specifically, we study the trade-off between the multiplicative efficiency gains the combination affords and its effect on task performance for language modeling. To dissect the effects of the two sparsities, we conduct a comparative analysis between densely activated models and sparsely activated event-based models across varying degrees of connectivity sparsity. We demonstrate that sparse activity and sparse connectivity complement each other without a proportional drop in task performance for an event-based neural network trained on the Penn Treebank and WikiText-2 language modeling datasets. Our results suggest sparsely connected event-based neural networks are promising candidates for effective and efficient sequence modeling.
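The "multiplicative efficiency gains" mentioned in the abstract can be illustrated with a small back-of-the-envelope calculation. This is only a sketch under a simplifying assumption that is not taken from the paper: a multiply-accumulate (MAC) is needed only when the presynaptic unit is active and the weight is non-zero, and the two conditions are statistically independent. The function names and the example densities are hypothetical.

```python
def expected_mac_fraction(activity_density: float, weight_density: float) -> float:
    """Fraction of dense MACs that remain when both sparsities are exploited.

    Assumes (for illustration only) that active units and surviving weights
    are independent, so the two densities simply multiply.
    """
    for name, value in (("activity_density", activity_density),
                        ("weight_density", weight_density)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return activity_density * weight_density


def dense_macs(n_inputs: int, n_outputs: int) -> int:
    """MACs for one dense matrix-vector product."""
    return n_inputs * n_outputs


# Hypothetical layer: 1000 x 1000 weights, 10% of units active, 20% of weights kept.
fraction = expected_mac_fraction(activity_density=0.1, weight_density=0.2)
remaining = dense_macs(1000, 1000) * fraction
print(f"remaining fraction of MACs: {fraction:.3f}")
print(f"expected MACs per step:     {remaining:.0f}")
```

Under these assumed densities, a tenfold reduction from activity sparsity and a fivefold reduction from weight sparsity combine into roughly a fiftyfold reduction in expected operations. Whether task performance survives that combination is the empirical question the paper studies.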

international conference, neural network, sparsity

2405.00433

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Germany > Saxony > Dresden (0.04)
North America > United States > Tennessee > Knox County > Knoxville (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Efficient recurrent architectures through activity sparsity and sparse back-propagation through time

Subramoney, Anand, Nazeer, Khaleelulla Khan, Schöne, Mark, Mayr, Christian, Kappel, David

arXiv.org Artificial Intelligence, Mar-9-2023

Recurrent neural networks (RNNs) are well suited for solving sequence tasks in resource-constrained systems due to their expressivity and low computational requirements. However, there is still a need to bridge the gap between what RNNs are capable of in terms of efficiency and performance and real-world application requirements. The memory and computational requirements arising from propagating the activations of all the neurons at every time step to every connected neuron, together with the sequential dependence of activations, contribute to the inefficiency of training and using RNNs. We propose a solution inspired by biological neuron dynamics that makes the communication between RNN units sparse and discrete. This makes the backward pass with backpropagation through time (BPTT) computationally sparse and efficient as well. We base our model on the gated recurrent unit (GRU), extending it with units that emit discrete events for communication triggered by a threshold so that no information is communicated to other units in the absence of events. We show theoretically that the communication between units, and hence the computation required for both the forward and backward passes, scales with the number of events in the network. Our model achieves efficiency without compromising task performance, demonstrating competitive performance compared to state-of-the-art recurrent network models in real-world tasks, including language modeling. The dynamic activity sparsity mechanism also makes our model well suited for novel energy-efficient neuromorphic hardware. Code is available at https://github.com/KhaleelKhan/EvNN/.
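The thresholded, event-based communication described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation (see the EvNN repository linked above for that): it assumes a unit communicates its state only when the state exceeds a per-unit threshold, and it omits the GRU gating and any state update that follows an event. The names `emit_events` and `event_driven_matvec` are hypothetical.

```python
from typing import Tuple

import numpy as np


def emit_events(state: np.ndarray, threshold: np.ndarray) -> np.ndarray:
    """Return the state where it exceeds the threshold, and zero elsewhere."""
    return np.where(state > threshold, state, 0.0)


def event_driven_matvec(weights: np.ndarray, events: np.ndarray) -> Tuple[np.ndarray, int]:
    """Compute weights @ events using only the columns of units that fired.

    The work done is proportional to the number of events rather than
    to the total number of units.
    """
    active = np.flatnonzero(events)           # indices of units with an event
    out = weights[:, active] @ events[active]
    return out, int(active.size)


rng = np.random.default_rng(0)
state = rng.normal(size=8)                    # hypothetical unit states
threshold = np.full(8, 0.5)                   # hypothetical per-unit thresholds
weights = rng.normal(size=(4, 8))

events = emit_events(state, threshold)
sparse_out, n_events = event_driven_matvec(weights, events)
dense_out = weights @ events                  # reference result for comparison
print(f"{n_events} of {state.size} units emitted an event")
```

The event-driven product gives the same result as the dense one while touching only the columns of units that fired, which is the sense in which computation "scales with the number of events".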

artificial intelligence, machine learning, sparsity

2206.06178

Country:

Asia > Middle East > Jordan (0.04)
Europe > Germany > Saxony > Dresden (0.04)
Europe > France (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.46)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Efficient Real Time Recurrent Learning through combined activity and parameter sparsity

Subramoney, Anand

arXiv.org Artificial Intelligence, Mar-9-2023

Backpropagation through time (BPTT) is the standard algorithm for training recurrent neural networks (RNNs), which requires separate simulation phases for the forward and backward passes for inference and learning, respectively. Moreover, BPTT requires storing the complete history of network states between phases, with memory consumption growing proportional to the input sequence length. This makes BPTT unsuited for online learning and presents a challenge for implementation on low-resource real-time systems. Real-Time Recurrent Learning (RTRL) allows online learning, and the growth of required memory is independent of sequence length. However, RTRL suffers from exceptionally high computational costs that grow proportional to the fourth power of the state size, making RTRL computationally intractable for all but the smallest of networks. In this work, we show that recurrent networks exhibiting high activity sparsity can reduce the computational cost of RTRL. Moreover, combining activity and parameter sparsity can lead to significant enough savings in computational and memory costs to make RTRL practical. Unlike previous work, this improvement in the efficiency of RTRL can be achieved without using any approximations for the learning process.
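The scaling argument in the abstract can be made concrete with rough operation counts. The sketch below assumes a fully connected recurrent layer with n units and n² recurrent weights, ignores constants, biases and input weights, and covers only the dense case; it does not reproduce the sparse cost analysis from the paper. The function names are hypothetical.

```python
def bptt_state_memory(seq_len: int, n_units: int) -> int:
    """States stored by BPTT: one state vector per time step."""
    return seq_len * n_units


def rtrl_influence_memory(n_units: int) -> int:
    """Entries of the RTRL influence matrix d(state)/d(params): n x n^2."""
    n_params = n_units * n_units
    return n_units * n_params


def rtrl_ops_per_step(n_units: int) -> int:
    """Multiplications to propagate the influence matrix one step.

    An (n x n) Jacobian times an (n x n^2) influence matrix costs
    n * n * n^2 = n^4 multiplications.
    """
    n_params = n_units * n_units
    return n_units * n_units * n_params


for n in (16, 32, 64):
    print(f"n={n:3d}  RTRL memory={rtrl_influence_memory(n):>9,d}  "
          f"RTRL ops/step={rtrl_ops_per_step(n):>12,d}  "
          f"BPTT memory (T=1000)={bptt_state_memory(1000, n):>7,d}")
```

Doubling the state size multiplies the dense RTRL cost per step by 16, whereas the BPTT memory grows with the sequence length instead. This is the trade-off the abstract describes.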

artificial intelligence, machine learning, sparsity

2303.05641

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Germany > North Rhine-Westphalia (0.04)

Genre: Research Report (0.41)

Technology:

Information Technology > Architecture (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

An Optimized Recurrent Unit for Ultra-Low-Power Keyword Spotting

Amoh, Justice, Odame, Kofi

arXiv.org Machine Learning, Feb-13-2019

There is growing interest in being able to run neural networks on sensors, wearables and internet-of-things (IoT) devices. However, the computational demands of neural networks make them difficult to deploy on resource-constrained edge devices. To meet this need, our work introduces a new recurrent unit architecture that is specifically adapted for on-device low-power acoustic event detection (AED). The proposed architecture is based on the gated recurrent unit (GRU) but features optimizations that make it implementable on ultra-low-power microcontrollers such as the Arm Cortex M0+. Our new architecture, the Embedded Gated Recurrent Unit (eGRU), is demonstrated to be highly efficient and suitable for short-duration AED and keyword spotting tasks. A single eGRU cell is 60x faster and 10x smaller than a GRU cell. Despite its optimizations, eGRU compares well with GRU across tasks of varying complexities. The practicality of eGRU is investigated in a wearable acoustic event detection application. An eGRU model is implemented and tested on the Arm Cortex M0+-based Atmel ATSAMD21E18 processor. The Arm M0+ implementation of the eGRU model compares favorably with a full-precision GRU that is running on a workstation. The embedded eGRU model achieves a classification accuracy of 95.3%, which is only 2% less than the full-precision GRU.
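For reference, the baseline that eGRU is compared against can be sketched as a single step of a standard, full-precision GRU cell. This is the textbook GRU, not the eGRU itself, whose specific optimizations are not described in the abstract. The parameter names, the sizes and the update convention h = (1 - z) * h_prev + z * h_candidate are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x: np.ndarray, h_prev: np.ndarray, p: dict) -> np.ndarray:
    """One step of a standard GRU cell (update gate z, reset gate r)."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])          # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])          # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand


def init_params(n_in: int, n_hidden: int, seed: int = 0) -> dict:
    """Small random parameters, for demonstration only."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in "zrh":
        params[f"W{gate}"] = 0.1 * rng.normal(size=(n_hidden, n_in))
        params[f"U{gate}"] = 0.1 * rng.normal(size=(n_hidden, n_hidden))
        params[f"b{gate}"] = np.zeros(n_hidden)
    return params


n_in, n_hidden = 13, 16            # hypothetical feature and state sizes
params = init_params(n_in, n_hidden)
h = np.zeros(n_hidden)
inputs = np.random.default_rng(1).normal(size=(20, n_in))
for x_t in inputs:                 # run the cell over a short sequence
    h = gru_step(x_t, h, params)
n_weights = sum(v.size for v in params.values())
print(f"final state shape: {h.shape}, parameters in the cell: {n_weights}")
```

Each step involves three input matrices, three recurrent matrices, and sigmoid and tanh evaluations in floating point, which gives a sense of what has to be reduced to fit a cell onto a microcontroller-class device.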

architecture, egru, neural network

1902.05026

Country:

North America > United States > New Hampshire > Grafton County > Hanover (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.51)

Industry: Information Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
