Action-Gap Phenomenon in Reinforcement Learning

Amir-massoud Farahmand

Neural Information Processing Systems 

Many practitioners of reinforcement learning have observed that an agent's performance often comes very close to optimal even though the estimated (action-)value function is still far from the optimal one. The goal of this paper is to explain and formalize this phenomenon by introducing the concept of action-gap regularity.
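The observation can be illustrated with a small numerical sketch. It is not the paper's formalization: it assumes that the action gap of a state means the difference between its best and second-best optimal action values, and the tables `q_star` and `q_hat`, as well as the helper names, are hypothetical values chosen for illustration. Under that assumption, whenever the estimation error stays below half of the smallest gap, the greedy policy derived from the estimate selects the same actions as the optimal policy, even though the estimate itself is noticeably wrong.

```python
def greedy_actions(q):
    """Return the greedy action (index of the largest value) for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


def action_gaps(q):
    """Return, per state, the difference between the best and second-best values."""
    gaps = []
    for row in q:
        ranked = sorted(row, reverse=True)
        gaps.append(ranked[0] - ranked[1])
    return gaps


# Hypothetical optimal action values for 3 states and 2 actions.
q_star = [[1.0, 0.0], [0.2, 0.9], [0.5, -0.5]]
# Hypothetical estimate in which every entry is off by about 0.3.
q_hat = [[0.7, 0.3], [0.5, 0.6], [0.2, -0.2]]

# Largest absolute estimation error over all state-action pairs.
max_error = max(
    abs(s - h)
    for row_star, row_hat in zip(q_star, q_hat)
    for s, h in zip(row_star, row_hat)
)
min_gap = min(action_gaps(q_star))
same_policy = greedy_actions(q_star) == greedy_actions(q_hat)

print(f"max error: {max_error:.2f}, smallest action gap: {min_gap:.2f}")
print(f"greedy policies agree: {same_policy}")
```

In this toy case the error is sizable relative to the values themselves, yet it stays below half of the smallest gap, so the two greedy policies coincide.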
