Efficient Decision-Theoretic Target Localization
Dressel, Louis (Stanford University) | Kochenderfer, Mykel J. (Stanford University)
Partially observable Markov decision processes (POMDPs) offer a principled approach to control under uncertainty. However, POMDP solvers generally require rewards to depend only on the state and action. This limitation is unsuitable for information-gathering problems, where rewards are more naturally expressed as functions of belief. In this work, we consider target localization, an information-gathering task where an agent takes actions leading to informative observations and a concentrated belief over possible target locations. By leveraging recent theoretical and algorithmic advances, we investigate offline and online solvers that incorporate belief-dependent rewards. We extend SARSOP, a state-of-the-art offline solver, to handle belief-dependent rewards, exploring different reward strategies and showing how they can be compactly represented. We present an improved lower bound that greatly speeds convergence. POMDP-lite, an online solver, is also evaluated in the context of information-gathering tasks. These solvers are applied to control a hex-copter UAV searching for a radio frequency source, a challenging real-world problem.
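To make the idea of a belief-dependent reward concrete, the following is a minimal sketch, not the authors' implementation. It assumes a discrete belief over four candidate target cells and a hypothetical sensor likelihood vector. Negative entropy and maximum belief are used here as two common ways to score how concentrated a belief is; the abstract does not state which reward strategies the paper adopts, so both are illustrative assumptions.

```python
import math


def bayes_update(belief, likelihoods):
    """Discrete Bayes update: posterior is proportional to prior * P(obs | cell)."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


def negative_entropy(belief):
    """Assumed reward: negative Shannon entropy, larger for concentrated beliefs."""
    return sum(p * math.log(p) for p in belief if p > 0.0)


def max_belief(belief):
    """Assumed alternative reward: probability mass of the most likely cell."""
    return max(belief)


# Uniform prior over four hypothetical target cells.
prior = [0.25, 0.25, 0.25, 0.25]
# Hypothetical likelihood of the received observation given each cell.
likelihoods = [0.7, 0.1, 0.1, 0.1]

posterior = bayes_update(prior, likelihoods)

# Both rewards depend only on the belief, not on the hidden state or action.
reward_before = negative_entropy(prior)
reward_after = negative_entropy(posterior)
```

In this sketch an informative observation raises both rewards, which is the behavior an information-gathering agent is meant to seek. Neither reward can be written as a function of the hidden state and action alone, which is the limitation of standard solvers that the abstract describes.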
Jun-14-2017
- Country:
- North America > United States
- Florida > Hillsborough County
- Tampa (0.04)
- California > Santa Clara County
- Palo Alto (0.04)
- Europe
- Switzerland > Zürich
- Zürich (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- Industry:
- Aerospace & Defense (0.46)
- Technology: