Integrating Episodic Memory into a Reinforcement Learning Agent using Reservoir Sampling

Young, Kenny J., Sutton, Richard S., Yang, Shuo

arXiv.org Machine Learning 

Episodic memory is a term from psychology referring to the ability to recall specific events from the past. We suggest that one advantage of this particular type of memory is the ability to easily assign credit to a specific state when remembered information is found to be useful. Inspired by this idea, and by the increasing popularity of external memory mechanisms for handling long-term dependencies in deep learning systems, we propose a novel algorithm that uses a reservoir sampling procedure to maintain an external memory consisting of a fixed number of past states. The algorithm allows a deep reinforcement learning agent to learn online to preferentially remember those states which are found to be useful to recall later on. Critically, this method allows efficient online computation of gradient estimates with respect to the write process of the external memory. Thus, unlike most prior mechanisms for external memory, it is feasible to use in an online reinforcement learning setting.

Much of reinforcement learning (RL) theory is based on the assumption that the environment has the Markov property, meaning that future states are independent of past states given the present state. This implies that the agent has all the information it needs to make an optimal decision at each time step and therefore has no need to remember the past. This assumption is unrealistic in general: realistic problems often require significant information from the past to make an informed decision in the present, and there is often no obvious way to incorporate the relevant information into an expanded present state.
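The memory-maintenance idea underlying the proposed method can be illustrated with classical reservoir sampling (Algorithm R), which keeps a fixed-size, uniformly sampled buffer over a stream of unknown length in O(k) memory. This is only a minimal sketch of the basic sampling procedure the paper builds on; the paper's actual algorithm learns non-uniform, state-dependent write probabilities and supports gradient estimation through the write process, neither of which is shown here. The function name and uniform weighting are illustrative assumptions.

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Keep a uniform random sample of k items from a stream of
    unknown length, using O(k) memory (Algorithm R)."""
    reservoir = []
    for t, item in enumerate(stream):
        if t < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a stored item with probability k / (t + 1),
            # which keeps every item seen so far equally likely
            # to be in the reservoir.
            j = rng.randrange(t + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

# Example: a 32-slot "memory" over a stream of 1000 observed states.
memory = reservoir_sample(range(1000), 32)
```

In the paper's setting, the stream would be the agent's visited states, and the replacement probabilities would be learned so that states found useful to recall are preferentially retained rather than sampled uniformly.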
