VDCBPI: an Approximate Scalable Algorithm for Large POMDPs
Poupart, Pascal; Boutilier, Craig
Neural Information Processing Systems
Existing algorithms for discrete partially observable Markov decision processes can at best solve problems of a few thousand states due to two important sources of intractability: the curse of dimensionality and the policy space complexity. This paper describes a new algorithm (VDCBPI) that mitigates both sources of intractability by combining the Value Directed Compression (VDC) technique [13] with Bounded Policy Iteration (BPI) [14]. The scalability of VDCBPI is demonstrated on synthetic network management problems with up to 33 million states.
Dec-31-2005