Gradient Descent Learns Linear Dynamical Systems

Hardt, Moritz, Ma, Tengyu, Recht, Benjamin

arXiv.org Machine Learning 

For a sequence model to be both expressive and parsimonious in its parameterization, it is crucial to equip the model with memory thus allowing its prediction at time t to depend on previously seen inputs. Recurrent neural networks form an expressive class of nonlinear sequence models. Through their many variants, such as long-short-term-memory [HS97], recurrent neural networks have seen remarkable empirical success in a broad range of domains. At the core, neural networks are typically trained using some form of (stochastic) gradient descent. Even though the training objective is non-convex, it is widely observed in practice that gradient descent quickly approaches a good set of model parameters.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found