On the difficulty of learning chaotic dynamics with RNNs
Jonas M. Mikhaeil and Daniel Durstewitz
–Neural Information Processing Systems
Recurrent neural networks (RNNs) are widespread machine learning tools for modeling sequential and time series data. They are notoriously hard to train because their loss gradients backpropagated in time tend to saturate or diverge during training. This is known as the exploding and vanishing gradient problem. Previous solutions to this issue either built on rather complicated, purpose-engineered architectures with gated memory buffers, or, more recently, imposed constraints that ensure convergence to a fixed point or restrict (the eigenspectrum of) the recurrence matrix. Such constraints, however, severely limit the expressivity of the RNN.
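To illustrate the exploding and vanishing gradient problem referred to in the abstract, the following minimal numpy sketch (not part of the paper) computes the norm of the backpropagated Jacobian product of a vanilla tanh RNN and shows how it decays or grows with sequence length depending on the spectral radius of the recurrence matrix; the network size, sequence length, and radii are arbitrary choices for illustration.

```python
import numpy as np

def bptt_jacobian_norm(spectral_radius, T=100, n=32, seed=0):
    """Illustrative sketch (not from the paper): spectral norm of the
    accumulated BPTT Jacobian dh_T/dh_0 for a vanilla RNN
    h_t = tanh(W h_{t-1}), with W rescaled to the given spectral radius."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n, n))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    h = 0.1 * rng.standard_normal(n)
    J = np.eye(n)                      # accumulated Jacobian dh_t / dh_0
    for _ in range(T):
        h = np.tanh(W @ h)
        D = np.diag(1.0 - h**2)        # derivative of tanh at the new state
        J = D @ W @ J                  # chain rule: one more backprop step
    return np.linalg.norm(J, 2)

for rho in (0.5, 1.0, 1.5):
    print(f"spectral radius {rho}: |dh_T/dh_0| ~ {bptt_jacobian_norm(rho):.3e}")
```

Running this typically shows the gradient norm collapsing toward zero for a small spectral radius and growing by many orders of magnitude for a large one, which is the dilemma the abstract describes: constraining the recurrence matrix tames the gradients but restricts the dynamics the RNN can express.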