Minimax-Bayes Reinforcement Learning
Buening, Thomas Kleine; Dimitrakakis, Christos; Eriksson, Hannes; Grover, Divya; Jorge, Emilio
arXiv.org Artificial Intelligence
While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.
Feb-21-2023
- Country:
- North America > United States
- Massachusetts (0.04)
- Europe
- Switzerland > Neuchâtel
- Neuchâtel (0.04)
- Spain > Valencian Community
- Valencia Province > Valencia (0.04)
- Norway > Eastern Norway
- Oslo (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (0.84)
- Industry:
- Education (0.34)