Symbolic Opportunistic Policy Iteration for Factored-Action MDPs

Raghavan, Aswin, Khardon, Roni, Fern, Alan, Tadepalli, Prasad

Feb-14-2020, 18:27:50 GMT–Neural Information Processing Systems

We address the scalability of symbolic planning under uncertainty with factored states and actions. Prior work has focused almost exclusively on factored states rather than factored actions, and on value iteration (VI) rather than policy iteration (PI). Our first contribution is a novel method for symbolic policy backups via the application of constraints, which yields a new, efficient symbolic implementation of modified PI (MPI) for factored action spaces. While this approach improves scalability in some cases, naive handling of policy constraints comes with its own scalability issues. This leads to our second and main contribution, symbolic Opportunistic Policy Iteration (OPI), a novel convergent algorithm lying between VI and MPI.
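For readers unfamiliar with how MPI relates to VI, the following is a minimal flat (tabular, non-symbolic) sketch of textbook modified policy iteration. It is not the symbolic, constraint-based method the abstract describes, and it does not implement OPI. The two-state MDP, the action names `stay` and `go`, and the parameter `k` are hypothetical and chosen only for illustration. With `k = 0` the loop reduces to value iteration; larger `k` applies more backups under the fixed greedy policy, moving toward full policy iteration.

```python
# Flat modified policy iteration on a hypothetical 2-state MDP (illustration only).
# P[s][a] is a list of (probability, next_state) pairs; R[s][a] is the reward.
GAMMA = 0.9

P = {
    0: {"stay": [(1.0, 0)], "go": [(0.8, 1), (0.2, 0)]},
    1: {"stay": [(1.0, 1)], "go": [(1.0, 0)]},
}
R = {
    0: {"stay": 0.0, "go": 0.0},
    1: {"stay": 1.0, "go": 0.0},
}


def q_value(V, s, a):
    """One-step lookahead value of taking action a in state s."""
    return R[s][a] + GAMMA * sum(p * V[s2] for p, s2 in P[s][a])


def greedy_policy(V):
    """Policy that picks, in each state, an action maximizing q_value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


def modified_policy_iteration(k, sweeps=200):
    """Run `sweeps` rounds of: greedy improvement, then 1 + k backups
    under the fixed greedy policy. k = 0 is value iteration."""
    V = {s: 0.0 for s in P}
    for _ in range(sweeps):
        pi = greedy_policy(V)
        for _ in range(1 + k):
            # Backup restricted to the current policy's action in each state.
            V = {s: q_value(V, s, pi[s]) for s in P}
    return V, greedy_policy(V)


if __name__ == "__main__":
    for k in (0, 5):
        V, pi = modified_policy_iteration(k)
        print(k, pi, V)
```

In this flat setting the policy-restricted backup is a simple dictionary lookup; the abstract's point is that doing the analogous restriction symbolically over factored actions is where the scalability issues arise.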

factored-action mdp, opportunistic policy iteration, symbolic opportunistic policy iteration

Genre:
- Research Report (0.63)

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.64)