Deeper & Sparser Exploration
Grover, Divya, Dimitrakakis, Christos
We address the problem of efficient exploration by proposing a new meta-algorithm for model-based online planning in Bayesian Reinforcement Learning (BRL). It outperforms the state of the art while being computationally faster, in some cases by two orders of magnitude. This is the first optimism-free BRL algorithm to outperform all previous state-of-the-art methods in tabular RL. The main novelty is a candidate policy generator that produces long-term options in the belief tree, which allows us to build much sparser and deeper trees. We present results on many standard environments and demonstrate the algorithm's performance empirically.
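The idea of planning over a few long-horizon candidate policies, rather than branching on every action at every step, can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' algorithm: the function names (`sample_mdp`, `greedy_policy`, `rollout`, `plan`) are hypothetical, posterior sampling followed by value iteration stands in for the candidate policy generator, rewards are assumed known, and the belief is held fixed during rollouts instead of being updated along each path of the belief tree.

```python
import random

def sample_mdp(counts, rng):
    """Draw one transition model from a Dirichlet belief given positive pseudo-counts.

    counts[s][a][t] is the pseudo-count for the transition s --a--> t.
    """
    model = []
    for s_counts in counts:
        row = []
        for a_counts in s_counts:
            # A Dirichlet sample is a vector of normalized Gamma draws.
            g = [rng.gammavariate(c, 1.0) for c in a_counts]
            total = sum(g)
            row.append([x / total for x in g])
        model.append(row)
    return model

def q_values(model, rewards, v, s, gamma):
    """One-step lookahead values for every action in state s."""
    return [
        rewards[s][a] + gamma * sum(p * v[t] for t, p in enumerate(model[s][a]))
        for a in range(len(model[s]))
    ]

def greedy_policy(model, rewards, gamma=0.95, iters=200):
    """Candidate policy (assumed generator): value iteration on one sampled model."""
    n_s = len(model)
    v = [0.0] * n_s
    for _ in range(iters):
        v = [max(q_values(model, rewards, v, s, gamma)) for s in range(n_s)]
    policy = []
    for s in range(n_s):
        q = q_values(model, rewards, v, s, gamma)
        policy.append(q.index(max(q)))
    return policy

def rollout(model, rewards, policy, state, depth, gamma, rng):
    """Discounted return of following one policy for `depth` steps in one model."""
    ret, disc = 0.0, 1.0
    for _ in range(depth):
        a = policy[state]
        ret += disc * rewards[state][a]
        state = rng.choices(range(len(model)), weights=model[state][a])[0]
        disc *= gamma
    return ret

def plan(counts, rewards, state, n_policies=4, n_models=8,
         depth=30, gamma=0.95, seed=0):
    """Pick the first action of the candidate policy with the best average return.

    The search branches only over n_policies candidates (sparse) and
    follows each for `depth` steps (deep).
    """
    rng = random.Random(seed)
    candidates = [
        greedy_policy(sample_mdp(counts, rng), rewards, gamma)
        for _ in range(n_policies)
    ]
    eval_models = [sample_mdp(counts, rng) for _ in range(n_models)]

    def score(policy):
        returns = [rollout(m, rewards, policy, state, depth, gamma, rng)
                   for m in eval_models]
        return sum(returns) / n_models

    best = max(candidates, key=score)
    return best[state], best
```

In this sketch the cost of a planning step grows with `n_policies * n_models * depth` rather than exponentially in the depth, which is the sense in which a policy-level search can afford to look deeper than an action-level one.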
Feb-7-2019
- Country:
- North America > United States
- Massachusetts (0.04)
- New Jersey (0.04)
- New York > New York County
- New York City (0.04)
- California > San Francisco County
- San Francisco (0.14)
- Europe
- Sweden (0.04)
- Germany > Baden-Württemberg
- Freiburg (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Education (0.46)