Prediction and Control in Continual Reinforcement Learning

Neural Information Processing Systems

Temporal difference (TD) learning is often used to update the estimate of the value function, which RL agents use to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We propose to decompose the value function into two components which update at different timescales: a permanent value function, which holds general knowledge that persists over time, and a transient value function, which allows quick adaptation to new situations. We establish theoretical results showing that our approach is well suited for continual learning and draw connections to the complementary learning systems (CLS) theory from neuroscience. Empirically, this approach improves performance significantly on both prediction and control problems.
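To make the two-timescale idea concrete, here is a minimal tabular sketch. It is not the paper's algorithm: the abstract does not give the update rules, so this sketch simply assumes both components are driven by the same TD(0) error with different step sizes. The names `td_two_timescale`, `slow`, `fast`, `slow_lr` and `fast_lr` are hypothetical.

```python
import numpy as np

def td_two_timescale(transitions, n_states, gamma=0.9, fast_lr=0.5, slow_lr=0.05):
    """Tabular TD(0) where the value estimate is slow[s] + fast[s].

    Assumption (not from the paper): both components follow the same
    TD error and differ only in step size.
    """
    slow = np.zeros(n_states)  # slowly changing component (general knowledge)
    fast = np.zeros(n_states)  # quickly changing component (rapid adaptation)
    for s, r, s_next, done in transitions:
        value = slow + fast
        # Bootstrapped target; no bootstrapping past a terminal transition.
        target = r + (0.0 if done else gamma * value[s_next])
        delta = target - value[s]
        fast[s] += fast_lr * delta
        slow[s] += slow_lr * delta
    return slow, fast
```

Under this assumption the fast component absorbs most of each correction, while the slow component drifts toward the same target more gradually.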


Online Learning and Control of Complex Dynamical Systems from Sensory Input

Neural Information Processing Systems

Identifying an effective model of a dynamical system from sensory data and using it for future state prediction and control is challenging. Recent data-driven algorithms based on Koopman theory are a promising approach to this problem, but they typically never update the model once it has been identified from a relatively small set of observations, which makes long-term prediction and control difficult for realistic systems, for example in robotics or fluid mechanics. This paper introduces a novel method for learning an embedding of the state space with linear dynamics from sensory data. Unlike previous approaches, the dynamics model can be updated online and thus easily applied to systems with non-linear dynamics in the original configuration space. The proposed approach is evaluated empirically on several classical dynamical systems and sensory modalities, with good performance on long-term prediction and control.
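One generic way to update a linear dynamics model online is recursive least squares, sketched below. This is an illustration, not the paper's method: it assumes the embedding `z` is already available (the abstract learns it from sensory data), and the class name `OnlineLinearDynamics` and the regularizer `reg` are hypothetical.

```python
import numpy as np

class OnlineLinearDynamics:
    """Recursive least-squares estimate of A in z_next ≈ A @ z."""

    def __init__(self, dim, reg=1e-3):
        self.A = np.zeros((dim, dim))
        # Inverse of the regularized input covariance, reg * I at the start.
        self.P = np.eye(dim) / reg

    def update(self, z, z_next):
        """Fold one observed transition (z, z_next) into the estimate."""
        Pz = self.P @ z
        gain = Pz / (1.0 + z @ Pz)
        error = z_next - self.A @ z          # one-step prediction error
        self.A += np.outer(error, gain)
        self.P -= np.outer(gain, Pz)         # Sherman-Morrison rank-1 update

    def predict(self, z, steps=1):
        """Roll the linear model forward for a number of steps."""
        for _ in range(steps):
            z = self.A @ z
        return z
```

Each update costs a few matrix-vector products, so the model can be refined after every new observation instead of being fixed after an initial identification phase.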


Memory-Efficient Learning of Stable Linear Dynamical Systems for Prediction and Control

Neural Information Processing Systems

Learning a stable Linear Dynamical System (LDS) from data involves creating models that both minimize reconstruction error and enforce stability of the learned representation. We propose a novel algorithm for learning stable LDSs. Using a recent characterization of stable matrices, we present an optimization method that ensures stability at every step and iteratively improves the reconstruction error using gradient directions derived in this paper. When applied to LDSs with inputs, our approach, in contrast to current methods for learning stable LDSs, updates both the state and control matrices, expanding the solution space and allowing for models with lower reconstruction error. We apply our algorithm in simulations and experiments to a variety of problems, including learning dynamic textures from image sequences and controlling a robotic manipulator. Compared to existing approaches, our proposed method achieves an orders-of-magnitude improvement in reconstruction error and superior results in terms of control performance. In addition, it is provably more memory efficient, with an $\mathcal{O}(n^2)$ space complexity compared to $\mathcal{O}(n^4)$ of competing alternatives, thus scaling to higher-dimensional systems when the other methods fail.
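The two quantities the abstract balances, reconstruction error and stability, can be illustrated with a plain unconstrained least-squares fit. This is only a baseline sketch, not the paper's stability-preserving algorithm, and the function names `fit_lds`, `reconstruction_error` and `spectral_radius` are hypothetical. It assumes the discrete-time model x_{t+1} = A x_t + B u_t and treats a spectral radius below 1 as the stability criterion.

```python
import numpy as np

def fit_lds(X, U):
    """Least-squares fit of x_{t+1} = A x_t + B u_t.

    X has shape (T + 1, n) and U has shape (T, m).
    Nothing here enforces stability of A.
    """
    n = X.shape[1]
    Z = np.hstack([X[:-1], U])                       # regressors [x_t, u_t]
    theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    return theta[:n].T, theta[n:].T                  # A is (n, n), B is (n, m)

def reconstruction_error(A, B, X, U):
    """Frobenius norm of the one-step prediction residual."""
    predicted = X[:-1] @ A.T + U @ B.T
    return np.linalg.norm(X[1:] - predicted)

def spectral_radius(A):
    """Largest eigenvalue magnitude; below 1 means a stable discrete-time A."""
    return np.max(np.abs(np.linalg.eigvals(A)))
```

On noisy or short data the unconstrained fit can return an `A` whose spectral radius exceeds 1, which is the failure mode that stability-constrained methods are designed to avoid.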


Unsupervised Learning of Lagrangian Dynamics from Images for Prediction and Control

Neural Information Processing Systems

Recent approaches for modelling dynamics of physical systems with neural networks enforce Lagrangian or Hamiltonian structure to improve prediction and generalization. However, when coordinates are embedded in high-dimensional data such as images, these approaches either lose interpretability or can only be applied to one particular example. We introduce a new unsupervised neural network model that learns Lagrangian dynamics from images, with interpretability that benefits prediction and control. The model infers Lagrangian dynamics on generalized coordinates that are simultaneously learned with a coordinate-aware variational autoencoder (VAE). The VAE is designed to account for the geometry of physical systems composed of multiple rigid bodies in the plane. By inferring interpretable Lagrangian dynamics, the model learns physical system properties, such as kinetic and potential energy, which enables long-term prediction of dynamics in the image space and synthesis of energy-based controllers.
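As a rough illustration of what an energy-based controller does once kinetic and potential energy are known, the sketch below uses a single pendulum with a hand-written energy function. In the paper the energies are learned from images; here they are assumed analytic, and the constants, gain, and function names are all hypothetical. The controller relies on the fact that for this pendulum dE/dt = u * qdot.

```python
import math

MASS, LENGTH, GRAVITY = 1.0, 1.0, 9.81   # assumed pendulum parameters

def pendulum_energy(q, qdot):
    """Total energy; q = 0 is the hanging position, q = pi is upright."""
    kinetic = 0.5 * MASS * (LENGTH * qdot) ** 2
    potential = -MASS * GRAVITY * LENGTH * math.cos(q)
    return kinetic + potential

def energy_controller(q, qdot, target_energy, gain=0.5):
    """Torque that pushes the total energy toward target_energy.

    Since dE/dt = u * qdot, choosing u = -gain * qdot * (E - E_target)
    gives d(E - E_target)/dt = -gain * qdot**2 * (E - E_target).
    """
    return -gain * qdot * (pendulum_energy(q, qdot) - target_energy)

def simulate(q=0.1, qdot=0.0, steps=10000, dt=1e-3):
    """Semi-implicit Euler rollout under the energy controller."""
    target = MASS * GRAVITY * LENGTH     # energy of the upright rest state
    for _ in range(steps):
        u = energy_controller(q, qdot, target)
        qddot = (-MASS * GRAVITY * LENGTH * math.sin(q) + u) / (MASS * LENGTH ** 2)
        qdot += dt * qddot
        q += dt * qdot
    return q, qdot, target
```

The controller never needs the equations of motion explicitly, only the energy and the velocity, which is why access to interpretable energy terms is useful for control.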




Review for NeurIPS paper: Unsupervised Learning of Lagrangian Dynamics from Images for Prediction and Control

Neural Information Processing Systems

This paper makes it possible to learn Lagrangian dynamics from images and use them for energy-based control. This represents an important and significant advance for the fledgling research subfield of physics-aware prediction, which might very well go on to prove important in the coming years. I believe the reviewers are all in agreement on this point. However, by entering this new territory for physics-aware prediction, this paper has also exposed itself to interest from a broader community of readers and NeurIPS attendees who are familiar with the progress in image-based *intuitive physics* modeling and control methods over the last 5 years or so (R2 and R4 point to some such approaches). A lot of the difficulty in arriving at a reviewer consensus for this paper can be put down to the fact that its positioning is somewhat myopic and ignores this broader context, perhaps because the authors themselves might not be familiar with these approaches.



