Deep Exploration via Bootstrapped DQN
Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy
Neural Information Processing Systems
Efficient exploration remains a major challenge for reinforcement learning (RL). Common dithering strategies for exploration, such as ε-greedy, do not carry out temporally-extended (or deep) exploration; this can lead to exponentially larger data requirements. However, most algorithms for statistically efficient RL are not computationally tractable in complex environments. Randomized value functions offer a promising approach to efficient exploration with generalization, but existing algorithms are not compatible with nonlinearly parameterized value functions. As a first step towards addressing such contexts we develop bootstrapped DQN. We demonstrate that bootstrapped DQN can combine deep exploration with deep neural networks for exponentially faster learning than any dithering strategy. In the Arcade Learning Environment bootstrapped DQN substantially improves learning speed and cumulative performance across most games.
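A minimal sketch of the core idea, not the authors' implementation: a shared network torso feeds K independent Q-value heads, and one head is sampled at the start of each episode and followed greedily, giving temporally-extended exploration instead of per-step dithering. The layer sizes, the PyTorch framing, and the `env.reset()`/`env.step()` interface below are illustrative assumptions.

```python
import random
import torch
import torch.nn as nn


class BootstrappedQNetwork(nn.Module):
    """Shared torso with K bootstrap heads, each an independent Q estimate."""

    def __init__(self, obs_dim: int, n_actions: int, n_heads: int = 10):
        super().__init__()
        # Placeholder MLP torso; the paper uses a convolutional net on Atari.
        self.torso = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(64, n_actions) for _ in range(n_heads)]
        )

    def forward(self, obs: torch.Tensor, head: int) -> torch.Tensor:
        return self.heads[head](self.torso(obs))


def run_episode(env, q_net: BootstrappedQNetwork, n_heads: int = 10):
    """Sample one head per episode and act greedily with respect to it."""
    head = random.randrange(n_heads)  # committed to for the whole episode
    obs, done = env.reset(), False    # assumed environment interface
    while not done:
        with torch.no_grad():
            q_values = q_net(torch.as_tensor(obs, dtype=torch.float32), head)
        action = int(q_values.argmax())
        obs, reward, done = env.step(action)
    # Training (not shown) would update each head on its own bootstrapped
    # subsample of replay data, e.g. via a per-head Bernoulli mask.
```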
- Genre:
- Research Report (0.67)
- Industry:
- Education (0.49)
- Leisure & Entertainment > Games (0.47)
- Technology: