echo state network
CogScale: Scalable Benchmark for Sequence Processing
Bendi-Ouis, Yannis, de Coudenhove, Romain, Hinaut, Xavier
The ability to maintain and manipulate information over time is a fundamental aspect of living beings and Artificial Intelligence. While modern models have achieved remarkable success in tasks like natural language processing, evaluating the capacity of novel architectures to process sequential information remains computationally expensive and time-consuming. Testing a new architecture often requires scaling up to massive datasets and models, leading to vast computational costs and slow iteration cycles. In this paper, we propose CogScale, a benchmark of 14 scalable synthetic tasks designed to isolate and evaluate specific cognitive and memory abilities at different parametrizable scales. By providing a standardized, lightweight framework, CogScale allows researchers to rapidly validate architectural innovations before committing to large-scale training. To establish a solid baseline, we evaluate seven distinct architectures: Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), xLSTM, Echo State Network (ESN), Mamba, Transformer Decoder, and Transformer Encoder-Decoder. These evaluations are conducted under strict parameter budgets (1k, 10k, and 100k) and across different difficulty levels and scales. Our results show that while classical RNNs and Echo State Networks excel at basic retention within strict parameter budgets, only attention mechanisms and modern state-space models consistently maintain high performance as reasoning complexity and task difficulty scale.
Echo Flow Networks
At the heart of time-series forecasting (TSF) lies a fundamental challenge: how can models efficiently and effectively capture long-range temporal dependencies across ever-growing sequences? While deep learning has brought notable progress, conventional architectures often face a trade-off between computational complexity and their ability to retain accumulative information over extended horizons. Echo State Networks (ESNs), a class of reservoir computing models, have recently regained attention for their exceptional efficiency, offering constant memory usage and per-step training complexity regardless of input length. This makes them particularly attractive for modeling extremely long-term event history in TSF. However, traditional ESNs fall short of state-of-the-art performance due to their limited nonlinear capacity, which constrains both their expressiveness and stability. We introduce Echo Flow Networks (EFNs), a framework composed of a group of extended Echo State Networks (X-ESNs) with MLP readouts, enhanced by our novel Matrix-Gated Composite Random Activation (MCRA), which enables complex, neuron-specific temporal dynamics, significantly expanding the network's representational capacity without compromising computational efficiency. In addition, we propose a dual-stream architecture in which recent input history dynamically selects signature reservoir features from an infinite-horizon memory, leading to improved prediction accuracy and long-term stability. Extensive evaluations on five benchmarks demonstrate that EFNs achieve up to 4x faster training and 3x smaller model size compared to leading methods like PatchTST, reducing forecasting error from 43% to 35%, a 20% relative improvement. One instantiation of our framework, EchoFormer, consistently achieves new state-of-the-art performance across five benchmark datasets: ETTh, ETTm, DMV, Weather, and Air Quality.
Generalization in Representation Models via Random Matrix Theory: Application to Recurrent Networks
Moakher, Yessin, Tiomoko, Malik, Louart, Cosme, Liao, Zhenyu
We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formulas that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted ("memory") input covariance, revealing a clear inductive bias toward recent inputs. Experiments match predictions: ESNs win in low-sample, short-memory regimes, while ridge prevails with more data or long-range dependencies. Our methodology provides a general framework for analyzing overparameterized models and offers insights into the behavior of deep learning networks.
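The bias toward recent inputs can be seen by unrolling a linear reservoir. The following is only a sketch under stated assumptions, not the paper's derivation: the symbols $W$ (recurrent matrix), $W_{\mathrm{in}}$ (input matrix), $u_t$ (input), $x_t$ (state), and the bound $\rho$ are illustrative, and the state is assumed to start at $x_0 = 0$ with $\|W\|_2 \le \rho < 1$.

```latex
\begin{align}
  x_t &= W x_{t-1} + W_{\mathrm{in}} u_t, \qquad x_0 = 0, \\
  x_t &= \sum_{k=0}^{t-1} W^{k} W_{\mathrm{in}} u_{t-k}, \\
  \bigl\| W^{k} W_{\mathrm{in}} u_{t-k} \bigr\|_2
      &\le \rho^{k} \, \bigl\| W_{\mathrm{in}} u_{t-k} \bigr\|_2 .
\end{align}
```

Under these assumptions, an input that lies $k$ steps in the past contributes to the current state with a weight that shrinks at least as fast as $\rho^{k}$, which is consistent with the exponentially time-weighted covariance described in the abstract.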
Sustainable NARMA-10 Benchmarking for Quantum Reservoir Computing
Kodali, Avyay, Singh, Priyanshi, Pandey, Pranay, Bhatia, Krishna, Devendrababu, Shalini, Ganguly, Srinjoy
This study compares Quantum Reservoir Computing (QRC) with classical (Echo State Networks, LSTMs) and hybrid quantum-classical methods (QLSTM) for the nonlinear autoregressive moving average task (NARMA-10). We evaluate forecasting accuracy (NRMSE), computational cost, and evaluation time. Results show QRC achieves competitive accuracy while offering potential sustainability advantages, particularly in resource-constrained settings, highlighting its promise for sustainable, time-series AI applications. Time-series forecasting is fundamental across science and engineering; the NARMA-10 benchmark probes temporal memory and nonlinear processing. We present a systematic comparison of Quantum Reservoir Computing (QRC) [4] against classical and hybrid baselines, namely Echo State Networks (ESN) [1], Long Short-Term Memory (LSTM) [2], and a quantum-inspired LSTM (QLSTM), on NARMA-10.
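For readers unfamiliar with the task, the sketch below generates a NARMA-10 series using the commonly cited recursion with inputs drawn uniformly from [0, 0.5], and computes an NRMSE. It is an illustration only: the function names `narma10` and `nrmse` are hypothetical, the abstract does not state which NRMSE normalization the authors use (normalizing by the target variance is an assumption here), and none of this is taken from the paper's code.

```python
import math
import random


def narma10(n_steps, seed=0):
    """Generate (u, y) for the commonly used NARMA-10 recursion.

    y[t+1] = 0.3*y[t] + 0.05*y[t]*sum(y[t-9..t]) + 1.5*u[t-9]*u[t] + 0.1
    Note: the recursion can occasionally diverge on long sequences.
    """
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 0.5) for _ in range(n_steps)]
    y = [0.0] * n_steps  # the first ten outputs are left at zero
    for t in range(9, n_steps - 1):
        window_sum = sum(y[t - 9 : t + 1])  # y[t-9] ... y[t], ten terms
        y[t + 1] = (
            0.3 * y[t]
            + 0.05 * y[t] * window_sum
            + 1.5 * u[t - 9] * u[t]
            + 0.1
        )
    return u, y


def nrmse(pred, target):
    """Root-mean-square error divided by the target's standard deviation.

    Assumption: other normalizations (for example by range) are also in use.
    """
    mean = sum(target) / len(target)
    var = sum((v - mean) ** 2 for v in target) / len(target)
    mse = sum((p - v) ** 2 for p, v in zip(pred, target)) / len(target)
    return math.sqrt(mse / var)


u, y = narma10(200)
```

With this normalization, a model that always predicts the target mean scores an NRMSE of 1, so values below 1 indicate that the model captures some of the series' structure.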
Modeling Biological Multifunctionality with Echo State Networks
Leventi-Peetz, Anastasia-Maria, Peetz, Jörg-Volker, Weber, Kai, Zacharis, Nikolaos
In this work, a three-dimensional multicomponent reaction-diffusion model has been developed, combining excitable-system dynamics with diffusion processes and sharing conceptual features with the FitzHugh-Nagumo model. Designed to capture the spatiotemporal behavior of biological systems, particularly electrophysiological processes, the model was solved numerically to generate time-series data. These data were subsequently used to train and evaluate an Echo State Network (ESN), which successfully reproduced the system's dynamic behavior. The results demonstrate that simulating biological dynamics using data-driven, multifunctional ESN models is both feasible and effective.
A Random Matrix Perspective of Echo State Networks: From Precise Bias--Variance Characterization to Optimal Regularization
Moakher, Yessin, Tiomoko, Malik, Louart, Cosme, Liao, Zhenyu
We present a rigorous asymptotic analysis of Echo State Networks (ESNs) in a teacher-student setting with a linear teacher with oracle weights. Leveraging random matrix theory, we derive closed-form expressions for the asymptotic bias, variance, and mean-squared error (MSE) as functions of the input statistics, the oracle vector, and the ridge regularization parameter. The analysis reveals two key departures from classical ridge regression: (i) ESNs do not exhibit double descent, and (ii) ESNs attain lower MSE when both the number of training samples and the teacher memory length are limited. We further provide an explicit formula for the optimal regularization in the identity input covariance case, and propose an efficient numerical scheme to compute the optimum in the general case. Together, these results offer interpretable theory and practical guidelines for tuning ESNs, helping reconcile recent empirical observations with provable performance guarantees.
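To make the role of the ridge regularization parameter concrete, here is a minimal sketch of an ESN with a closed-form ridge readout. It assumes NumPy, a tanh reservoir, and arbitrary hyperparameters; the function names (`make_reservoir`, `run_reservoir`, `fit_ridge_readout`) and the sine-wave task are hypothetical illustrations, not the teacher-student setup analyzed in the paper.

```python
import numpy as np


def make_reservoir(n_inputs, n_units, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scale, input_scale, size=(n_units, n_inputs))
    w = rng.normal(size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def run_reservoir(w_in, w, inputs):
    """Map inputs of shape (T, n_inputs) to states of shape (T, n_units)."""
    x = np.zeros(w.shape[0])
    states = np.zeros((len(inputs), w.shape[0]))
    for t, u in enumerate(inputs):
        x = np.tanh(w @ x + w_in @ u)
        states[t] = x
    return states


def fit_ridge_readout(states, targets, ridge=1e-6):
    """Solve (X^T X + ridge * I) W_out = X^T Y for the linear readout."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


# Toy usage: one-step-ahead prediction of a sine wave.
signal = np.sin(0.2 * np.arange(600))
inputs, targets = signal[:-1, None], signal[1:, None]
w_in, w = make_reservoir(n_inputs=1, n_units=50)
states = run_reservoir(w_in, w, inputs)
washout = 100  # discard early states that still depend on the zero initial state
w_out = fit_ridge_readout(states[washout:400], targets[washout:400])
test_mse = float(np.mean((states[400:] @ w_out - targets[400:]) ** 2))
```

Only `w_out` is fitted; the reservoir weights stay fixed, so the `ridge` argument is the single regularization knob in this sketch.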
Empirical Investigation into Configuring Echo State Networks for Representative Benchmark Problem Domains
Weborg, Brooke R., Serpen, Gursel
This paper examines the performance of the Echo State Network (ESN), a recurrent neural network configured as a reservoir computer, on four benchmark problems, and proposes heuristics, or rules of thumb, for configuring the architecture and for selecting parameters and their values. The heuristics are applicable to problems within the same domain and are intended to help fill the experience gap for those entering this field of study. The influence of parameter selections, value adjustments, and architectural changes on an ESN can be challenging to fully comprehend without experience in the field, and even some hyperparameter optimization algorithms may have difficulty adjusting parameter values unless proper manual selections are made first. It is therefore imperative to understand the effects of parameters and their values on ESN performance for a successful build. To reduce the need for an extensive background in ESN architecture, and to examine how ESN performance is affected by variations in architecture, design, and parameter selection and values, a series of benchmark tasks representing different problem domains (time series prediction, pattern generation, chaotic system prediction, and time series classification) were modeled and experimented on.
Echo State Networks for Bitcoin Time Series Prediction
Sharma, Mansi, Sartor, Enrico, Cavazza, Marc, Prendinger, Helmut
Forecasting stock and cryptocurrency prices is challenging due to high volatility and non-stationarity, influenced by factors like economic changes and market sentiment. Previous research shows that Echo State Networks (ESNs) can effectively model short-term stock market movements, capturing nonlinear patterns in dynamic data. To the best of our knowledge, this work is among the first to explore ESNs for cryptocurrency forecasting, especially during extreme volatility. We also conduct chaos analysis through the Lyapunov exponent in chaotic periods and show that our approach outperforms existing machine learning methods by a significant margin. Our findings are consistent with the Lyapunov exponent analysis, showing that ESNs are robust during chaotic periods and excel under high chaos compared to Boosting and Naïve methods.
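As background on the Lyapunov exponent mentioned above, the sketch below estimates it for the logistic map, a textbook system whose derivative is known analytically. This is an assumption-laden illustration: the abstract does not say how the authors estimate the exponent from price data, and the function name `logistic_lyapunov` and its defaults are hypothetical.

```python
import math


def logistic_lyapunov(r, x0=0.2, n_transient=1000, n_steps=20000):
    """Average log|f'(x)| along an orbit of f(x) = r * x * (1 - x)."""
    x = x0
    for _ in range(n_transient):  # let the orbit settle before averaging
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        slope = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / n_steps


chaotic = logistic_lyapunov(3.9)   # chaotic regime
periodic = logistic_lyapunov(3.2)  # stable period-2 cycle
```

A positive estimate indicates that nearby trajectories separate exponentially (chaos), while a negative one indicates convergence to a stable orbit.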
ReMi: A Random Recurrent Neural Network Approach to Music Production
Chateau-Laurent, Hugo, Vanhatalo, Tara, Pan, Wei-Tung, Hinaut, Xavier
We show that randomly initialized recurrent neural networks can produce arpeggios and low-frequency oscillations that are rich and configurable. In contrast to end-to-end music generation that aims to replace musicians, our approach expands their creativity while requiring no data and much less computational power. More information can be found at: https://allendia.com/ 1. INTRODUCTION Artificial intelligence continues to drive significant changes in music production. However, current methods often require vast amounts of high-quality data, which are not always readily available.
A Method of Selective Attention for Reservoir Based Agents
Training of deep reinforcement learning agents is slowed considerably by the presence of input dimensions that do not usefully condition the reward function. Existing modules such as layer normalization can be trained with weight decay to act as a form of selective attention, i.e. an input mask, that shrinks the scale of unnecessary inputs, which in turn accelerates training of the policy. However, we find, surprisingly, that adding numerous parameters to the computation of the input mask results in much faster training. A simple, high-dimensional masking module is compared with layer normalization and a model without any input suppression. The high-dimensional mask resulted in a four-fold speedup in training over the null hypothesis and a two-fold speedup in training over the layer normalization method.