Prioritizing Samples in Reinforcement Learning with Reducible Loss

Neural Information Processing Systems 

Most reinforcement learning algorithms take advantage of an experience replay buffer to repeatedly train on samples the agent has observed in the past. However, not all samples are equally significant, and assigning equal importance to each of them is a naive strategy. In this paper, we propose a method to prioritize samples based on how much can be learned from each sample.
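One way to read "how much we can learn from a sample" is as a reducible-loss priority: the gap between the learner's current loss on a transition and the loss of a reference model (e.g. a frozen or target network), so that samples whose error a better model has already eliminated are ranked highest, while irreducibly noisy samples are ranked low. The sketch below illustrates this sampling scheme with toy per-sample losses; the two loss arrays, the clipping floor, and the batch size are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-sample losses for 8 replay-buffer transitions:
# the learner's current loss vs. a frozen reference model's loss.
learner_loss = rng.uniform(0.0, 2.0, size=8)
reference_loss = rng.uniform(0.0, 2.0, size=8)

# Reducible loss: the part of the learner's loss the reference model has
# already eliminated. Clip at a small floor so samples whose loss is
# irreducible (noise) still have nonzero but negligible priority.
priority = np.clip(learner_loss - reference_loss, a_min=1e-6, a_max=None)

# Draw a minibatch with probability proportional to priority.
probs = priority / priority.sum()
batch = rng.choice(len(priority), size=4, replace=False, p=probs)
print(batch)
```

In practice the priorities would be refreshed as the learner trains, since a sample's reducible loss shrinks once the learner catches up to the reference model on it.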
