Provably efficient RL with Rich Observations via Latent State Decoding

Du, Simon S., Krishnamurthy, Akshay, Jiang, Nan, Agarwal, Alekh, Dudík, Miroslav, Langford, John

Jan-25-2019–arXiv.org Machine Learning

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states. Under certain identifiability assumptions, we demonstrate how to estimate a mapping from the observations to latent states inductively through a sequence of regression and clustering steps, in which previously decoded latent states provide labels for later regression problems, and use it to construct good exploration policies. We provide finite-sample guarantees on the quality of the learned state decoding function and exploration policies, and complement our theory with an empirical evaluation on a class of hard exploration problems. Our method exponentially improves over Q-learning with naïve exploration, even when Q-learning has cheating access to latent states.
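The regress-then-cluster idea in the abstract can be illustrated with a small toy sketch. This is not the paper's algorithm or its experimental setup. The two-state dynamics, the interleaved observation model, the histogram regressor, the greedy L1 clustering, and every function name below are assumptions made for illustration only. The sketch assumes the previous level's latent states are already decoded, regresses the (previous state, action) label on the next observation, and then groups observations whose predicted label distributions are close.

```python
import random

def sample_step(rng):
    """One transition of a hypothetical toy problem (an assumption, not from the paper)."""
    s = rng.randrange(2)      # previous latent state, assumed already decoded
    a = rng.randrange(2)      # uniformly random action
    s_next = s ^ a            # hidden next latent state
    k = rng.randrange(4)
    # Each latent state emits into interleaved intervals of [0, 1),
    # so nearby observations need not share a latent state.
    x = (2 * k + s_next + rng.random()) / 8.0
    return (s, a), x, s_next

def bin_index(x, n_bins):
    return min(int(x * n_bins), n_bins - 1)

def fit_backward_regressor(data, n_bins=8, n_labels=4):
    """Regression step: per-bin empirical distribution over (prev state, action) labels."""
    counts = [[0] * n_labels for _ in range(n_bins)]
    for (s, a), x, _ in data:
        counts[bin_index(x, n_bins)][2 * s + a] += 1
    return [[c / max(sum(row), 1) for c in row] for row in counts]

def cluster_predictions(preds, threshold=0.5):
    """Clustering step: greedily merge bins whose prediction vectors are within threshold in L1."""
    centers, assignment = [], []
    for p in preds:
        for j, c in enumerate(centers):
            if sum(abs(u - v) for u, v in zip(p, c)) < threshold:
                assignment.append(j)
                break
        else:
            centers.append(p)
            assignment.append(len(centers) - 1)
    return assignment

rng = random.Random(0)
train = [sample_step(rng) for _ in range(4000)]
assignment = cluster_predictions(fit_backward_regressor(train))

def decode(x):
    """Learned decoder for this toy: observation -> bin -> cluster id."""
    return assignment[bin_index(x, len(assignment))]

# Evaluate against the hidden state, up to relabeling of the two clusters.
held_out = [sample_step(rng) for _ in range(1000)]
agree = sum(decode(x) == s_next for _, x, s_next in held_out) / len(held_out)
accuracy = max(agree, 1.0 - agree)
print(len(set(assignment)), accuracy)
```

In this toy, clustering happens in the space of predicted label distributions rather than in raw observation space, which is why the interleaved emissions do not confuse the decoder. The cluster ids returned by `decode` could then serve as labels for the regression at the next level, in the inductive spirit the abstract describes.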

artificial intelligence, latent state, neural network

Country:
- North America > United States (0.27)

Genre:
- Research Report (0.63)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.48)
    - Neural Networks (0.93)
    - Reinforcement Learning (0.68)
    - Statistical Learning (0.68)
  - Representation & Reasoning (1.00)
