Distributional Reinforcement Learning for Efficient Exploration

Borislav Mavrin, Shangtong Zhang, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yaoliang Yu

arXiv.org, Machine Learning

In distributional reinforcement learning (RL), the estimated distribution of the value function models both the parametric and the intrinsic uncertainty. We propose a novel and efficient exploration method for deep RL with two components. The first is a decaying schedule to suppress the intrinsic uncertainty. The second is an exploration bonus computed from the upper quantiles of the learned distribution. On Atari 2600 games, our method outperforms QR-DQN in 12 out of 14 hard games (achieving a 483% average gain in cumulative rewards over QR-DQN across 49 games, with a big win in Venture). We also compare our algorithm with QR-DQN in a challenging 3D driving simulator (CARLA); results show that our algorithm achieves near-optimal safety rewards twice as fast as QR-DQN.
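To illustrate the two components described in the abstract, below is a minimal Python sketch of optimistic action selection on top of QR-DQN-style quantile estimates. It is not the authors' implementation: the decay form c_t = c0 / sqrt(1 + step), the use of the upper half of the quantiles, and the function and parameter names (select_action, quantile_values, c0) are assumptions for illustration only.

    import numpy as np

    def select_action(quantile_values, step, num_upper=None, c0=1.0):
        """Pick an action from quantile estimates with an optimism bonus
        taken from the upper quantiles, scaled by a decaying schedule.

        quantile_values: array of shape (num_actions, num_quantiles),
            quantiles sorted in ascending order for each action.
        step: current training step, used by the (hypothetical) decay.
        """
        num_actions, num_quantiles = quantile_values.shape
        if num_upper is None:
            num_upper = num_quantiles // 2  # assumed: use the upper half

        # Mean over all quantiles approximates the expected return (Q-value).
        q_mean = quantile_values.mean(axis=1)

        # Exploration bonus: how far the upper quantiles sit above the mean.
        upper = quantile_values[:, -num_upper:]
        bonus = (upper - q_mean[:, None]).mean(axis=1)

        # Decaying schedule suppresses the bonus as training progresses,
        # damping the contribution of intrinsic uncertainty over time.
        c_t = c0 / np.sqrt(1.0 + step)

        return int(np.argmax(q_mean + c_t * bonus))

In use, the agent would call select_action on the quantile outputs of its network at every environment step, so early training is optimistic and exploratory while late training approaches greedy action selection on the mean return.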
