Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

Carlos Riquelme, Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy A. Mann, Andre Barreto, Gergely Neu

Neural Information Processing Systems 

In reinforcement learning (RL), an agent must learn how to behave while interacting with an environment. This challenging problem is usually formalized as the search for a decision policy, i.e., a
