Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks

Livi, Lorenzo

arXiv.org Artificial Intelligence 

--We study how gating mechanisms in recurrent neural networks (RNNs) implicitly induce adaptive learning-rate behavior, even when training is carried out with a fixed, global learning rate. This effect arises from the coupling between state-space time scales-parametrized by the gates-and parameter-space dynamics during gradient descent. By deriving exact Jacobians for leaky-integrator and gated RNNs, we obtain a first-order expansion that makes explicit how constant, scalar, and multi-dimensional gates reshape gradient propagation, modulate effective step sizes, and introduce anisotropy in parameter updates. These findings reveal that gates not only control information flow, but also act as data-driven preconditioners that adapt optimization trajectories in parameter space. Empirical simulations corroborate these claims: in several sequence tasks, we show that gates induce lag-dependent effective learning rates and directional concentration of gradient flow, with multi-gate models matching or exceeding the anisotropic structure produced by Adam. These results highlight that optimizer-driven and gate-driven adaptivity are complementary but not equivalent mechanisms. Overall, this work provides a unified dynamical systems perspective on how gating couples state evolution with parameter updates, explaining why gated architectures achieve robust trainability and stability in practice.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found