Deep Successor Reinforcement Learning

Kulkarni, Tejas D., Saeedi, Ardavan, Gautam, Simanta, Gershman, Samuel J.

Jun-8-2016–arXiv.org Machine Learning

Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components: a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state, and the reward predictor maps states to scalar rewards. The value function of a state can be computed as the inner product between the successor map and the reward weights. In this paper, we present Deep Successor Reinforcement Learning (DSR), which generalizes SR within an end-to-end deep reinforcement learning framework. DSR has several appealing properties, including increased sensitivity to distal reward changes due to the factorization of reward and world dynamics, and the ability to extract bottleneck states (subgoals) given successor maps trained under a random policy. We show the efficacy of our approach on two diverse environments given raw pixel observations: simple grid-world domains (MazeBase) and the Doom game engine.
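The decomposition described in the abstract can be illustrated in the tabular case. The following is a minimal sketch, not the deep architecture of the paper: it assumes a small, known transition matrix under a fixed policy and uses the standard closed-form successor map M = (I - gamma * P)^-1. The 3-state chain, the discount factor, and the names `successor_map`, `P`, `w`, and `w_new` are hypothetical and chosen only for illustration.

```python
import numpy as np

def successor_map(P, gamma):
    """Closed-form tabular successor map: M = (I - gamma * P)^-1.

    Row s of M holds the expected discounted future occupancy of every
    state when starting from state s and following the fixed policy.
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

# Hypothetical 3-state chain: 0 -> 1 -> 2, with state 2 absorbing.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
gamma = 0.5
M = successor_map(P, gamma)

# Reward weights: one scalar reward per state (only state 2 is rewarding).
w = np.array([0.0, 0.0, 1.0])

# Value of each state = inner product of its successor-map row with w.
V = M @ w

# If the reward moves to state 1, only w changes; M is reused unchanged.
w_new = np.array([0.0, 1.0, 0.0])
V_new = M @ w_new

print(V)
print(V_new)
```

Because the dynamics live entirely in `M` and the rewards entirely in `w`, a change in reward requires updating only `w`, which is the factorization the abstract credits for sensitivity to distal reward changes. In DSR both components are learned from raw pixels rather than computed in closed form.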
