Monte Carlo POMDPs
Neural Information Processing Systems
We present a Monte Carlo algorithm for learning to act in partially observable Markov decision processes (POMDPs) with real-valued state and action spaces. Our approach uses importance sampling for representing beliefs, and Monte Carlo approximation for belief propagation. A reinforcement learning algorithm, value iteration, is employed to learn value functions over belief states. Finally, a sample-based version of nearest neighbor is used to generalize across states. Initial empirical results suggest that our approach works well in practical applications.
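To make the idea of a sample-based belief concrete, here is a minimal sketch of one belief-propagation step, with the belief represented as a set of state samples reweighted by importance sampling. It is not the paper's implementation: the 1-D state, the additive motion model, the Gaussian observation likelihood, and the names `propagate_belief`, `motion_noise`, and `obs_noise` are all assumptions chosen for illustration.

```python
import math
import random


def propagate_belief(particles, action, observation, rng,
                     motion_noise=0.1, obs_noise=0.5):
    """One sample-based belief update for a hypothetical 1-D system."""
    # Monte Carlo prediction: push every sample through an assumed
    # motion model x' = x + action + Gaussian noise.
    predicted = [x + action + rng.gauss(0.0, motion_noise) for x in particles]

    # Importance weights from an assumed Gaussian observation likelihood
    # p(observation | x'), computed in log space for numerical stability.
    log_w = [-0.5 * ((observation - x) / obs_noise) ** 2 for x in predicted]
    max_log_w = max(log_w)
    weights = [math.exp(lw - max_log_w) for lw in log_w]

    # Resample in proportion to the weights so the returned set is an
    # unweighted sample-based approximation of the new belief.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
belief = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # prior belief samples
new_belief = propagate_belief(belief, action=1.0, observation=1.2, rng=rng)
```

After the update, the samples in `new_belief` should cluster between the predicted state (about 1.0) and the observation (1.2), and should be more tightly concentrated than the prior samples.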
Dec-31-2000
- Country:
- North America > United States
- Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > Japan
- Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Genre:
- Research Report > New Finding (0.48)
- Technology: