Entropy-Regularized Partially Observed Markov Decision Processes
Molloy, Timothy L., Nair, Girish N.
arXiv.org Artificial Intelligence
We investigate partially observed Markov decision processes (POMDPs) with cost functions regularized by entropy terms describing state, observation, and control uncertainty. Standard POMDP techniques are shown to offer bounded-error solutions to these entropy-regularized POMDPs, with exact solutions when the regularization involves the joint entropy of the state, observation, and control trajectories. Our joint-entropy result is particularly surprising since it constitutes a novel, tractable formulation of active state estimation.

Partially observed Markov decision processes (POMDPs) and Markov decision processes (MDPs) with information-theoretic costs have attracted widespread attention across the technical disciplines of systems and control [2]-[5], computer science [6]-[8], signal processing [9]-[12], and robotics [13]-[15]. Interest in such POMDPs has been driven, in large part, by active state estimation problems in which information-theoretic costs describing the uncertainty about latent states are minimized in order to aid or enhance the performance of state estimation algorithms [5], [6], [9], [10].
Dec-22-2021
- Country:
- Oceania > Australia (0.14)
- North America > United States
- New York (0.04)
- Massachusetts > Middlesex County
- Belmont (0.04)
- Europe
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Switzerland > Zürich
- Zürich (0.04)
- Genre:
- Research Report (0.82)
- Technology: