Provably Efficient Exploration for RL with Unsupervised Learning

Feng, Fei, Wang, Ruosong, Yin, Wotao, Du, Simon S., Yang, Lin F.

Mar-15-2020–arXiv.org Artificial Intelligence

We study how to use unsupervised learning for efficient exploration in reinforcement learning with rich observations generated from a small number of latent states. We present a novel algorithmic framework that is built upon two components: an unsupervised learning algorithm and a no-regret reinforcement learning algorithm. We show that our algorithm provably finds a near-optimal policy with sample complexity polynomial in the number of latent states, which is significantly smaller than the number of possible observations. Our result gives theoretical justification to the prevailing paradigm of using unsupervised learning for efficient exploration [Tang et al., 2017; Bellemare et al., 2016].
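The two-component structure described in the abstract can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three-state chain environment, the scalar noisy observations, 1-D k-means as a stand-in for the unsupervised learning component, optimistic tabular Q-learning with a count-based bonus as a stand-in for the no-regret component, and all names and constants (`emit`, `step`, `fit_decoder`, `ucb_q_learning`, `run_sketch`, the bonus scale `c`). Collecting the decoder's training data with a uniformly random policy is also a simplification made here.

```python
import math
import random

# Toy setting (illustrative assumptions): 3 latent states on a chain,
# 2 actions, horizon 3. The agent never sees the latent state, only a
# noisy scalar observation clustered around 10 * latent_state.
N_LATENT, N_ACTIONS, HORIZON = 3, 2, 3

def emit(state, rng):
    """Rich observation: many possible values, generated by few latent states."""
    return 10.0 * state + rng.uniform(-1.0, 1.0)

def step(state, action):
    """Action 1 moves right along the chain; action 0 resets to state 0.
    Reward 1 only for taking action 1 in the last latent state."""
    reward = 1.0 if (state == N_LATENT - 1 and action == 1) else 0.0
    next_state = min(state + 1, N_LATENT - 1) if action == 1 else 0
    return next_state, reward

def fit_decoder(observations, k, iters=20):
    """Component 1 (stand-in): 1-D k-means mapping observations to cluster ids."""
    lo, hi = min(observations), max(observations)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in observations:
            groups[min(range(k), key=lambda j: abs(x - centers[j]))].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return lambda x: min(range(k), key=lambda j: abs(x - centers[j]))

def ucb_q_learning(decode, rng, episodes=3000, c=0.5):
    """Component 2 (stand-in): optimistic tabular Q-learning on decoded labels."""
    Q = [[[float(HORIZON)] * N_ACTIONS for _ in range(N_LATENT)]
         for _ in range(HORIZON)]
    N = [[[0] * N_ACTIONS for _ in range(N_LATENT)] for _ in range(HORIZON)]
    for _ in range(episodes):
        state = 0
        z = decode(emit(state, rng))  # the learner only uses decoded labels
        for h in range(HORIZON):
            a = max(range(N_ACTIONS), key=lambda b: Q[h][z][b])
            state, r = step(state, a)
            z_next = decode(emit(state, rng))
            v_next = min(float(HORIZON), max(Q[h + 1][z_next])) if h + 1 < HORIZON else 0.0
            N[h][z][a] += 1
            t = N[h][z][a]
            lr = (HORIZON + 1) / (HORIZON + t)
            # Count-based bonus keeps rarely tried actions optimistic.
            Q[h][z][a] = (1 - lr) * Q[h][z][a] + lr * (r + v_next + c / math.sqrt(t))
            z = z_next
    return Q

def greedy_return(Q, decode, rng):
    """Total reward of one episode that follows the greedy policy of Q."""
    state, total = 0, 0.0
    for h in range(HORIZON):
        z = decode(emit(state, rng))
        a = max(range(N_ACTIONS), key=lambda b: Q[h][z][b])
        state, r = step(state, a)
        total += r
    return total

def run_sketch(seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(200):  # random-policy data for the decoder (a simplification)
        state = 0
        for _ in range(HORIZON):
            data.append(emit(state, rng))
            state, _ = step(state, rng.randrange(N_ACTIONS))
    decode = fit_decoder(data, N_LATENT)
    Q = ucb_q_learning(decode, rng)
    return greedy_return(Q, decode, rng)
```

Calling `run_sketch()` returns the total reward collected by the learned greedy policy in one episode; in this toy chain the best achievable return is 1.0. The point of the sketch is that the Q-table is indexed by the 3 decoded labels rather than by the raw observations, which is the intuition behind a sample complexity that depends on the number of latent states.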

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Unsupervised or Indirectly Supervised Learning (1.00)
  - Reinforcement Learning (1.00)
