Prioritizing Samples in Reinforcement Learning with Reducible Loss

Jan-16-2025, 22:57:35 GMT–Neural Information Processing Systems

Most reinforcement learning algorithms use an experience replay buffer to train repeatedly on samples the agent has observed in the past. Not all samples are equally significant, and simply assigning equal importance to each of them is a naïve strategy. In this paper, we propose a method to prioritize samples based on how much we can learn from them. We define the learn-ability of a sample as the steady decrease of the training loss associated with this sample over time. We develop an algorithm that prioritizes samples with high learn-ability while assigning lower priority to those that are hard to learn, typically because of noise or stochasticity.
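The idea in the abstract can be illustrated with a toy replay buffer. This is only a minimal sketch under stated assumptions, not the paper's algorithm: it approximates learn-ability by the drop in a sample's loss between its two most recent updates, and the class name `LossDecreaseReplayBuffer`, its methods, and the `eps` parameter are all hypothetical names introduced for illustration.

```python
import random


class LossDecreaseReplayBuffer:
    """Toy buffer: sampling weight follows each sample's most recent loss decrease.

    Assumption: learn-ability is approximated by (previous loss - current loss),
    clipped at zero. The abstract does not specify the exact formula.
    """

    def __init__(self, eps=1e-3):
        self.samples = []
        self.prev_loss = []  # last recorded loss per sample (None until first update)
        self.priority = []   # sampling weight per sample
        self.eps = eps       # small floor so that no sample is starved completely

    def add(self, sample):
        # New samples start at the current maximum priority so they get visited.
        initial = max(self.priority, default=1.0)
        self.samples.append(sample)
        self.prev_loss.append(None)
        self.priority.append(initial)

    def update(self, index, loss):
        prev = self.prev_loss[index]
        if prev is not None:
            # A loss that went down gives high priority; a loss that stayed
            # flat or went up (for example, a noisy target) falls to eps.
            self.priority[index] = max(prev - loss, 0.0) + self.eps
        self.prev_loss[index] = loss

    def sample(self, k, rng=random):
        # Draw k indices with probability proportional to priority.
        return rng.choices(range(len(self.samples)), weights=self.priority, k=k)
```

In this sketch, a sample whose loss falls from 1.0 to 0.5 receives a priority of about 0.5, while one whose loss rises from 1.0 to 1.2 drops to `eps`, so the first is drawn far more often. A single-step difference is a crude stand-in for a "steady" decrease; a running average over several updates would be one possible refinement.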

prioritizing sample, reducible loss, reinforcement learning

Country:
- Europe > Portugal > Braga > Braga (0.10)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)