Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
Su Young Lee, Sungik Choi, Sae-Young Chung
We propose Episodic Backward Update, a new algorithm that boosts the performance of a deep reinforcement learning agent through fast reward propagation. In contrast to the conventional use of experience replay with uniform random sampling, our agent samples a whole episode and successively propagates the value of each state back to its preceding states. Our computationally efficient recursive algorithm allows sparse and delayed rewards to propagate through all transitions of a sampled episode. We evaluate our algorithm on the 2D MNIST Maze environment and 49 games of the Atari 2600 environment, and show that our method improves sample efficiency at a comparable computational cost.
May 31, 2018