Action-Gap Phenomenon in Reinforcement Learning
School of Computer Science, McGill University, Montreal, Quebec, Canada

Mar-14-2024, 21:33:33 GMT–Neural Information Processing Systems

Many practitioners of reinforcement learning have observed that an agent's performance often comes very close to optimal even though its estimated (action-)value function is still far from the optimal one. The goal of this paper is to explain and formalize this phenomenon by introducing the concept of action-gap regularity.
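
The phenomenon can be illustrated with a small sketch. It is not taken from the paper: the two states, the action values, and the helper names `greedy` and `action_gap` are all hypothetical. It relies only on the standard observation that, within a state, the greedy action cannot change as long as the estimation error stays below half the gap between the best and second-best optimal action values.

```python
# Hypothetical two-state, two-action example; all numbers are made up.
q_star = {"s0": [1.0, 0.0], "s1": [0.2, 0.9]}  # assumed optimal action values
q_hat = {"s0": [0.6, 0.3], "s1": [0.5, 0.8]}   # assumed inaccurate estimate


def greedy(q):
    """Return the index of the greedy action in each state."""
    return {s: max(range(len(v)), key=v.__getitem__) for s, v in q.items()}


def action_gap(values):
    """Difference between the best and second-best action values."""
    top, second = sorted(values, reverse=True)[:2]
    return top - second


# Largest absolute estimation error over all state-action pairs.
max_error = max(
    abs(a - b) for s in q_star for a, b in zip(q_star[s], q_hat[s])
)

# Per-state check: is the error in that state below half its action gap?
within_half_gap = {
    s: max(abs(a - b) for a, b in zip(q_star[s], q_hat[s]))
    < action_gap(q_star[s]) / 2
    for s in q_star
}

# The estimate is far from q_star, yet both induce the same greedy policy.
same_policy = greedy(q_hat) == greedy(q_star)

print("max estimation error:", max_error)
print("error below half the gap, per state:", within_half_gap)
print("greedy policies agree:", same_policy)
```

In this toy setting the value estimate is noticeably wrong in every entry, but the greedy policy is unchanged, so the performance loss would be zero. Larger action gaps tolerate larger estimation errors.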

action-gap regularity, action-value function, performance loss

Country:
- North America
  - United States > New York
    - New York County > New York City (0.04)
  - Canada > Quebec
    - Montreal (0.86)
- Asia > Middle East
  - Israel > Haifa District > Haifa (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)