Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
Jean-Bastien Grill, Michal Valko, Rémi Munos
SequeL team, INRIA Lille - Nord Europe, France; Google DeepMind, UK
Neural Information Processing Systems
You are a robot and you live in a Markov decision process (MDP) with a finite or an infinite number of transitions from state-action to next states. You got brains and so you plan before you act. Luckily, your roboparents equipped you with a generative model to do some Monte-Carlo planning. The world is waiting for you and you have no time to waste. You want your planning to be efficient.
Mar-12-2024, 13:01:22 GMT