Intermittently Observable Markov Decision Processes
Chen, Gongpu, Liew, Soung-Chang
arXiv.org Artificial Intelligence
This paper investigates MDPs with intermittent state information. We consider a scenario where the controller perceives the state information of the process via an unreliable communication channel. The transmissions of state information over the whole time horizon are modeled as a Bernoulli lossy process. Hence, the problem is finding an optimal policy for selecting actions in the presence of state information losses. We first formulate the problem as a belief MDP to establish structural results. The effect of state information losses on the expected total discounted reward is studied systematically. Then, we reformulate the problem as a tree MDP whose state space is organized in a tree structure. Two finite-state approximations to the tree MDP are developed to find near-optimal policies efficiently. Finally, we put forth a nested value iteration algorithm for the finite-state approximations, which is proved to be faster than standard value iteration. Numerical results demonstrate the effectiveness of our methods.
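The belief formulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: the state is transmitted after each transition, each transmission is lost independently with probability `loss_prob`, and a received state is observed exactly. The names `predict_belief`, `update_belief`, `simulate`, and the two-state kernel in the usage note are hypothetical.

```python
import random


def predict_belief(belief, transition, action):
    """Propagate a belief one step through the kernel of `action`.

    transition[action][i][j] is the probability of moving from state i to j.
    """
    n = len(belief)
    kernel = transition[action]
    return [sum(belief[i] * kernel[i][j] for i in range(n)) for j in range(n)]


def update_belief(belief, transition, action, observed_state=None):
    """Belief after taking `action`.

    If the state arrived (observed_state is not None), the belief collapses
    to a point mass on it. If the transmission was lost, the controller can
    only push the previous belief through the transition kernel.
    """
    if observed_state is not None:
        return [1.0 if j == observed_state else 0.0 for j in range(len(belief))]
    return predict_belief(belief, transition, action)


def simulate(transition, policy, loss_prob, start_state, steps, seed=0):
    """Roll out a policy under Bernoulli state-information losses (assumed model)."""
    rng = random.Random(seed)
    n = len(transition[0])
    state = start_state
    belief = [1.0 if j == start_state else 0.0 for j in range(n)]
    history = []
    for _ in range(steps):
        action = policy(belief)
        # Sample the next true state from the row of the chosen action's kernel.
        state = rng.choices(range(n), weights=transition[action][state])[0]
        # The transmission succeeds with probability 1 - loss_prob.
        received = rng.random() >= loss_prob
        belief = update_belief(
            belief, transition, action, state if received else None
        )
        history.append((action, received, belief))
    return history
```

For example, with a single action whose kernel is `[[0.9, 0.1], [0.2, 0.8]]` and a known starting state 0, one lost transmission gives the belief `[0.9, 0.1]` and a second consecutive loss gives `[0.83, 0.17]`. Under these assumptions the belief is fully determined by the last received state and the actions taken since, which is one way to read the tree-structured state space described in the abstract.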
Feb-22-2023
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Asia > China
- Hong Kong (0.04)
- Genre:
- Research Report > New Finding (0.34)