Bayes-Adaptive Simulation-based Search with Value Function Approximation

Arthur Guez, Nicolas Heess, David Silver, Peter Dayan

Neural Information Processing Systems 

Bayes-adaptive planning offers a principled solution to the exploration-exploitation trade-off under model uncertainty. It finds the optimal policy in belief space, which explicitly accounts for the expected effect on future rewards of reductions in uncertainty. However, the Bayes-adaptive solution is typically intractable in domains with large or continuous state spaces. We present a tractable method for approximating the Bayes-adaptive solution by combining simulation-based search with a novel value function approximation technique that generalises appropriately over belief space. Our method outperforms prior approaches in both discrete bandit tasks and simple continuous navigation and control tasks.
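To make the idea of simulation-based search in belief space concrete, the following is a minimal illustrative sketch, not the paper's algorithm: a two-armed Bernoulli bandit with independent Beta priors, where each root action is evaluated by short random rollouts through belief space and the tail of each rollout is bootstrapped with a hand-crafted value approximation over the belief's sufficient statistics (the paper instead learns a function approximator that generalises over beliefs). The discount factor, rollout depth, simulation budget, and all function names below are assumptions for illustration.

```python
# Illustrative sketch (not the authors' exact algorithm): Bayes-adaptive
# simulation-based search on a 2-armed Bernoulli bandit.  The belief is a
# pair of Beta(alpha, beta) counts per arm; rollouts simulate transitions
# of the belief state and are truncated with a simple value-function
# approximation over the belief.  All names and constants are hypothetical.
import random

GAMMA = 0.95          # discount factor (assumed)
ROLLOUT_DEPTH = 5     # search depth before bootstrapping with leaf_value
NUM_SIMS = 200        # simulations per root action


def leaf_value(belief):
    """Crude value approximation over belief space: discounted return of
    always pulling the arm with the highest posterior mean, ignoring any
    further information gain."""
    best_mean = max(a / (a + b) for a, b in belief)
    return best_mean * GAMMA / (1.0 - GAMMA)


def sample_transition(belief, arm):
    """Sample a reward from the posterior predictive of `arm` and return
    the reward together with the updated belief."""
    a, b = belief[arm]
    reward = 1.0 if random.random() < a / (a + b) else 0.0
    next_belief = list(belief)
    next_belief[arm] = (a + reward, b + (1.0 - reward))
    return reward, next_belief


def rollout(belief, depth):
    """Simulate one trajectory in belief space under a uniform random
    rollout policy, bootstrapping the tail with leaf_value."""
    if depth == 0:
        return leaf_value(belief)
    arm = random.randrange(len(belief))
    reward, next_belief = sample_transition(belief, arm)
    return reward + GAMMA * rollout(next_belief, depth - 1)


def search(belief):
    """Estimate Bayes-adaptive action values at the root by averaging
    simulated returns, and return the greedy arm with its value estimates."""
    values = []
    for arm in range(len(belief)):
        total = 0.0
        for _ in range(NUM_SIMS):
            reward, next_belief = sample_transition(belief, arm)
            total += reward + GAMMA * rollout(next_belief, ROLLOUT_DEPTH - 1)
        values.append(total / NUM_SIMS)
    return max(range(len(values)), key=values.__getitem__), values


if __name__ == "__main__":
    prior = [(1.0, 1.0), (1.0, 1.0)]   # uniform Beta priors for both arms
    arm, q = search(prior)
    print("estimated Bayes-adaptive values:", q, "-> pull arm", arm)
```

The point of the sketch is the structure, not the specific choices: because rollouts act on the belief state, the value of an action automatically includes the benefit of the information it yields, and the leaf-value approximation is what the paper replaces with a learned function that generalises appropriately over belief space.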