Depth and nonlinearity induce implicit exploration for RL

Dauparas, Justas, Tomioka, Ryota, Hofmann, Katja

May-29-2018–arXiv.org Artificial Intelligence

Reinforcement learning (RL) is a systematic approach to learning in sequential decision problems, where a learner's future task performance depends on its past actions. In such settings, learners have to explore, meaning they have to take actions with uncertain outcomes in order to learn about the consequences of those actions. How best to explore is a key open question in RL. Here, we address this question from an empirical perspective and investigate how to explore in a way that leads to sample-efficient learning in deep RL, i.e., reinforcement learning with value function approximators that are parameterized as powerful neural networks. We present a surprising finding: in this setting, good approximate value functions can be learned without any explicit exploration. In fact, we find that on several standard benchmark tasks, learning without explicit exploration is as sample efficient as, or more sample efficient than, the most commonly used ɛ-greedy exploration scheme. We present additional results that suggest a likely role of model structure (network depth and nonlinearity) in inducing such implicit exploration. We believe that our insights have strong practical implications and open up a novel line of research towards understanding exploration in deep RL.
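For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of ɛ-greedy action selection, written under our own assumptions rather than taken from the paper. The function name `select_action` and its parameters are hypothetical. Setting `epsilon` to 0 gives the purely greedy policy, which is one plausible reading of "learning without explicit exploration".

```python
import random
from typing import Optional, Sequence


def select_action(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick an action index from estimated action values.

    Hypothetical helper for illustration; not the authors' code.
    With probability `epsilon` a uniformly random action is taken
    (explicit exploration); otherwise the action with the highest
    estimated value is taken (greedy).
    """
    if rng is None:
        rng = random.Random()
    # rng.random() lies in [0, 1), so epsilon == 0 never explores.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Usage: epsilon = 0.0 is purely greedy, epsilon = 0.1 explores 10% of the time.
greedy_action = select_action([0.1, 0.9, 0.3], epsilon=0.0)
```

In this sketch the only difference between the two regimes is the value of `epsilon`; any exploration that occurs at `epsilon = 0` must come from changes in the value estimates themselves during learning.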

artificial intelligence, machine learning, reinforcement learning

Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Reinforcement Learning (0.95)
