PAC-Bayesian Model Selection for Reinforcement Learning
Fard, Mahdi M., Pineau, Joelle
Neural Information Processing Systems
This paper introduces the first set of PAC-Bayesian bounds for the batch reinforcement learning problem in finite state spaces. These bounds hold regardless of the correctness of the prior distribution. We demonstrate how such bounds can be used for model-selection in control problems where prior information is available either on the dynamics of the environment, or on the value of actions. Our empirical results confirm that PAC-Bayesian model-selection is able to leverage prior distributions when they are informative and, unlike standard Bayesian RL approaches, ignores them when they are misleading.
Dec-31-2010
- Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > Canada > Quebec > Montreal (0.14)
- Genre:
- Research Report > New Finding (0.34)
- Industry:
- Education (0.34)