Generative Temporal Models with Memory

Gemici, Mevlana; Hung, Chia-Chun; Santoro, Adam; Wayne, Greg; Mohamed, Shakir; Rezende, Danilo J.; Amos, David; Lillicrap, Timothy

arXiv.org Machine Learning 

We consider the general problem of modeling temporal data with long-range dependencies, wherein new observations are fully or partially predictable based on temporally-distant, past observations. A sufficiently powerful temporal model should separate predictable elements of the sequence from unpredictable elements, express uncertainty about those unpredictable elements, and rapidly identify novel elements that may help to predict the future. To create such models, we introduce Generative Temporal Models augmented with external memory systems. They are developed within the variational inference framework, which provides both a practical training methodology and methods to gain insight into the models' operation. We show, on a range of problems with sparse, long-term temporal dependencies, that these models store information from early in a sequence, and reuse this stored information efficiently. This allows them to perform substantially better than existing models based on well-known recurrent neural networks, like LSTMs.
Many of the data sets we use in machine learning applications are sequential, whether these be natural language and speech processing data, streams of high-definition video, longitudinal time-series from medical diagnostics, or spatiotemporal data in climate forecasting. Generative Temporal Models (GTMs) are a core requirement for these applications. Generative Temporal Models are also important components of intelligent agents, as they permit counterfactual reasoning, physical predictions, robot localisation, and simulation-based planning among other capacities (Sutton, 1991; Deisenroth and Rasmussen, 2011; Watter et al., 2015; Levine and Abbeel, 2014; Assael et al., 2015). These tasks require models of high-dimensional observation sequences and contain complex, long temporal dependencies--requirements that most available GTMs are unable to fulfil. Developing such GTMs is the aim of this paper.
Many GTMs--whether they are linear or nonlinear, deterministic or stochastic--assume that the underlying temporal dynamics are governed by low-order Markov transitions and use fixed-dimensional sufficient statistics. Examples of such models include Hidden Markov Models (Rabiner, 1989) and linear dynamical systems such as Kalman filters and their nonlinear extensions (Kalman, 1960; Ghahramani and Hinton, 1996; Krishnan et al., 2015). The fixed-order Markov assumption used in these models is insufficient for characterising many systems of practical relevance.
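The limitation of a fixed-order Markov assumption can be illustrated with a toy recall task. The sketch below is not the paper's model or one of its experiments; the task, the sequence length, the alphabet size, and all function names are assumptions made for illustration. Each sequence ends by repeating its first symbol, with independent noise in between. A count-based first-order Markov predictor sees only the previous symbol, which carries no information about the target, whereas a predictor that keeps the first symbol in memory can recall it directly.

```python
import random
from collections import Counter, defaultdict


def make_sequence(rng, length=20, alphabet=4):
    """Toy sequence: the final symbol repeats the first; the rest is noise."""
    first = rng.randrange(alphabet)
    noise = [rng.randrange(alphabet) for _ in range(length - 2)]
    return [first] + noise + [first]


def fit_first_order(sequences):
    """Count-based estimate of the first-order transitions p(x_t | x_{t-1})."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    return counts


def markov_predict_last(counts, seq):
    """Predict the final symbol using only the symbol just before it."""
    return counts[seq[-2]].most_common(1)[0][0]


def memory_predict_last(seq):
    """Hypothetical memory-based predictor: recall the stored first symbol."""
    return seq[0]


def accuracy(predict, sequences):
    """Fraction of sequences whose final symbol is predicted correctly."""
    return sum(predict(s) == s[-1] for s in sequences) / len(sequences)


rng = random.Random(0)  # fixed seed so the run is repeatable
train = [make_sequence(rng) for _ in range(2000)]
test = [make_sequence(rng) for _ in range(2000)]

counts = fit_first_order(train)
markov_acc = accuracy(lambda s: markov_predict_last(counts, s), test)
memory_acc = accuracy(memory_predict_last, test)

print(f"first-order Markov accuracy: {markov_acc:.3f}")
print(f"memory-based accuracy:       {memory_acc:.3f}")
```

With four symbols, the first-order predictor should stay close to chance (about 0.25) on the final step, because the target is independent of the preceding symbol. Raising the Markov order helps only once it spans the whole sequence, which is the scaling problem that motivates external memory.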
