Generalization in Representation Models via Random Matrix Theory: Application to Recurrent Networks

Yessin Moakher, Malik Tiomoko, Cosme Louart, Zhenyu Liao

arXiv.org Machine Learning 

We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formulas that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted ("memory") input covariance, revealing a clear inductive bias toward recent inputs. Experiments match the theoretical predictions: ESNs win in low-sample, short-memory regimes, while ridge regression prevails with more data or long-range dependencies. Our methodology provides a general framework for analyzing overparameterized models and offers insights into the behavior of deep neural networks.
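To illustrate the ESN-ridge equivalence claimed above, here is a minimal numerical sketch. It assumes a toy linear ESN whose recurrent matrix is rho * I and whose input weights W_in have orthonormal columns (both simplifying assumptions for illustration; the paper's result concerns random recurrent weights in the high-dimensional limit). Under these assumptions the reservoir state is an isometric embedding of the exponentially time-weighted input z_t = sum_k rho^k u_{t-k}, so ridge predictions from the two feature sets coincide exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: input d_in, reservoir d_res >= d_in, sequence length T.
d_in, d_res, T, rho, lam = 5, 40, 400, 0.8, 1e-1

# Input weights with orthonormal columns (toy assumption for this sketch).
W_in = np.linalg.qr(rng.standard_normal((d_res, d_in)))[0]

# Inputs, and a target depending on an exponentially weighted input history.
U = rng.standard_normal((T, d_in))
beta = rng.standard_normal(d_in)

# Exponentially time-weighted inputs: z_t = sum_{k>=0} rho^k u_{t-k}.
Z = np.zeros((T, d_in))
z = np.zeros(d_in)
for t in range(T):
    z = rho * z + U[t]
    Z[t] = z
y = Z @ beta + 0.1 * rng.standard_normal(T)

# Linear ESN states with recurrent matrix rho * I:
# h_t = rho * h_{t-1} + W_in @ u_t, which unrolls to h_t = W_in @ z_t.
H = Z @ W_in.T

def ridge_pred(X, y, lam):
    # Kernel form of ridge predictions: X X^T (X X^T + lam I)^{-1} y.
    K = X @ X.T
    return K @ np.linalg.solve(K + lam * np.eye(len(y)), y)

# Because W_in^T W_in = I, we have H H^T = Z Z^T, so the two ridge
# predictors agree to numerical precision.
print(np.max(np.abs(ridge_pred(H, y, lam) - ridge_pred(Z, y, lam))))  # ~1e-12
```

The kernel (Gram-matrix) form makes the mechanism transparent: ridge regression only sees the features through X X^T, and the orthonormal projection leaves that Gram matrix unchanged. This is the sense in which the linear ESN behaves as ridge regression on inputs with an exponentially time-weighted covariance.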
