DeepMDP: Learning Continuous Latent Space Models for Representation Learning

Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare

Jun-6-2019–arXiv.org Machine Learning

Many reinforcement learning (RL) tasks provide the agent with high-dimensional observations that can be simplified into low-dimensional continuous states. To formalize this process, we introduce the concept of a DeepMDP, a parameterized latent space model that is trained via the minimization of two tractable losses: prediction of rewards and prediction of the distribution over next latent states. We show that the optimization of these objectives guarantees (1) the quality of the latent space as a representation of the state space and (2) the quality of the DeepMDP as a model of the environment. We connect these results to prior work in the bisimulation literature, and explore the use of a variety of metrics. Our theoretical findings are substantiated by the experimental result that a trained DeepMDP recovers the latent structure underlying high-dimensional observations on a synthetic environment. Finally, we show that learning a DeepMDP as an auxiliary task in the Atari 2600 domain leads to large performance improvements over model-free RL.
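
The two training losses named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear encoder, the per-action reward heads and the per-action transition matrices (`embed`, `latent_reward`, `latent_transition`) are hypothetical stand-ins for the parameterized model. It also assumes deterministic environment and latent dynamics, so that "prediction of the distribution over next latent states" reduces to the distance between two points; the choice of absolute error for the reward term is likewise an assumption.

```python
import math

def embed(obs, W):
    """Hypothetical linear encoder: maps an observation to a latent state."""
    return [sum(w * x for w, x in zip(row, obs)) for row in W]

def latent_reward(z, action, r_weights):
    """Hypothetical latent reward model: one linear head per action."""
    return sum(w * zi for w, zi in zip(r_weights[action], z))

def latent_transition(z, action, T):
    """Hypothetical deterministic latent transition: one matrix per action."""
    return [sum(t * zi for t, zi in zip(row, z)) for row in T[action]]

def deepmdp_losses(batch, W, r_weights, T):
    """Mean reward loss and mean transition loss over a batch of
    (obs, action, reward, next_obs) tuples."""
    reward_loss = 0.0
    transition_loss = 0.0
    for obs, action, reward, next_obs in batch:
        z = embed(obs, W)
        z_next = embed(next_obs, W)
        # Reward loss: error of the latent reward model (absolute error assumed).
        reward_loss += abs(reward - latent_reward(z, action, r_weights))
        # Transition loss: both distributions are assumed to be point masses,
        # so the distance between them is the distance between the two points.
        z_pred = latent_transition(z, action, T)
        transition_loss += math.dist(z_next, z_pred)
    n = len(batch)
    return reward_loss / n, transition_loss / n

# Toy data: 4-D observations whose first two coordinates are the true state.
W = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
r_weights = {0: [1.0, 0.0], 1: [0.0, 1.0]}
T = {0: [[1.0, 0.0], [0.0, 1.0]],   # action 0 keeps the latent state
     1: [[0.0, 1.0], [1.0, 0.0]]}   # action 1 swaps the two coordinates
batch = [
    ([1.0, 2.0, 9.0, 9.0], 1, 2.0, [2.0, 1.0, 5.0, 5.0]),
    ([3.0, 4.0, 0.0, 0.0], 0, 3.0, [3.0, 4.0, 7.0, 7.0]),
]
losses = deepmdp_losses(batch, W, r_weights, T)
```

In this toy setup the latent model matches the underlying dynamics exactly, so both losses are zero. Replacing a transition matrix with an incorrect one raises the transition loss while leaving the reward loss unchanged, which is why the two terms are tracked separately.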

artificial intelligence, machine learning, reinforcement learning
