Memory of recurrent networks: Do we compute it right?

Ballarin, Giovanni, Grigoryeva, Lyudmila, Ortega, Juan-Pablo

arXiv.org Artificial Intelligence 

Recurrent Neural Networks (RNNs) are among the most widely used machine learning tools for sequential data processing [Suts 14]. Despite the rising popularity of transformer deep neural architectures [Vasw 17, Gali 22, Acci 23], particularly in natural language processing, RNNs remain better suited to a significant range of real-time and online learning tasks that require handling one element of the sequence at a time. The key difference is that transformers are designed to process entire time sequences at once, using self-attention mechanisms to focus on particular entries of the input, whereas RNNs use hidden state spaces to retain a memory of previous elements of the input sequence, which makes memory one of the most important features of RNNs.

Multiple attempts have been made in recent years to design quantitative measures of memory and to characterize it in neural networks in general [Vers 20, Koyu 23] and in their recurrent versions in particular [Havi 19, Li 21]. The notion of memory capacity (MC) in recurrent neural networks was first introduced in [Jaeg 02], with a particular focus on the so-called echo state networks (ESNs) [Matt 92, Matt 94, Jaeg 04], which are a popular family of RNNs within the reservoir computing (RC) strand of the literature and have been shown to be universal approximants in various contexts [Grig 18, Gono 20b, Gono 21]. RC models are state-space systems whose state map parameters are randomly generated; they can be seen as RNNs with random inner neuron connection weights and a readout layer that is trained for the learning task of interest.
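In [Jaeg 02], the memory capacity of a state-space system driven by a scalar input z_t is defined as MC = sum over delays k >= 1 of MC_k, where MC_k is the squared correlation between the delayed input z_{t-k} and its best reconstruction by a trained linear readout of the current state; for an N-dimensional state and independent inputs, MC is bounded above by N. The following minimal sketch, not taken from the paper, builds an ESN with a randomly generated reservoir and a ridge-regression readout and estimates its linear memory capacity numerically; the reservoir size, spectral radius, ridge penalty, and truncation of the sum over delays are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Reservoir: randomly generated and left untrained (the RC paradigm).
n_res = 100                                       # number of inner neurons (assumed)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # rescale to spectral radius 0.9
w_in = rng.uniform(-0.5, 0.5, size=n_res)         # random input weights

# Drive the reservoir with an i.i.d. scalar input sequence.
T, washout = 5000, 500
z = rng.uniform(-1.0, 1.0, size=T)
states = np.empty((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W @ x + w_in * z[t])              # x_t = tanh(W x_{t-1} + w_in z_t)
    states[t] = x
X = states[washout:]                              # discard the initial transient

# MC_k: squared correlation between z_{t-k} and its ridge-regression
# reconstruction from the current state (the readout is the only trained part).
G = X.T @ X                                       # Gram matrix, reused for every delay
def mc_k(k, ridge=1e-6):
    yk = z[washout - k:T - k]                     # delayed input, aligned with the rows of X
    w = np.linalg.solve(G + ridge * np.eye(n_res), X.T @ yk)
    return np.corrcoef(X @ w, yk)[0, 1] ** 2

mc = sum(mc_k(k) for k in range(1, 2 * n_res))    # truncate the sum at 2 * n_res delays
print(f"estimated memory capacity: {mc:.2f} (theoretical bound: {n_res})")
```

Note that the estimate depends on the ridge penalty and the truncation point, which is precisely the kind of numerical sensitivity in MC computations that the paper's title alludes to.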
