α-min: A Compact Approximate Solver For Finite-Horizon POMDPs
Dujardin, Yann (Commonwealth Scientific and Industrial Research Organisation (CSIRO)) | Dietterich, Tom (Oregon State University) | Chades, Iadine (Commonwealth Scientific and Industrial Research Organisation (CSIRO))
In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers. One measure of simplicity for POMDP value functions is the number of alpha-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function, which can require a very large number of alpha-vectors. This paper studies methods that allow the user to explore the tradeoff between the accuracy of the value function and the number of alpha-vectors. Building on previous point-based POMDP solvers, this paper introduces a new algorithm (alpha-min) that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of alpha-vectors. At each time-step, alpha-min calculates alpha-vectors to greedily minimize the gap between current upper and lower bounds of the value function. In doing so, good upper and lower bounds are quickly reached allowing a good approximation of the problem with few alpha-vectors . Experimental results show that alpha-min provides good approximate solutions given a fixed number of alpha-vectors on small benchmark problems, on a larger randomly generated problem, as well as on a computational sustainability problem to best manage the endangered Sumatran tiger.
Jul-15-2015
- Country:
- North America > United States
- Oregon (0.04)
- New Jersey
- Mercer County > Princeton (0.04)
- Hudson County > Hoboken (0.04)
- Europe > Switzerland
- North America > United States
- Genre:
- Research Report (0.54)
- Overview (0.48)
- Technology: