Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
Carlos Riquelme, Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy A. Mann, Andre Barreto, Gergely Neu
Neural Information Processing Systems
In reinforcement learning (RL), an agent must learn how to behave while interacting with an environment. This challenging problem is usually formalized as the search for a decision policy, i.e., a
Oct-3-2025, 02:46:15 GMT