Sunberg
Online solvers for partially observable Markov decision processes have been applied to problems with large discrete state spaces, but continuous state, action, and observation spaces remain a challenge. This paper begins by investigating double progressive widening (DPW) as a solution to this challenge. However, we prove that this modification alone is not sufficient: the belief representations in the search tree collapse to a single particle, causing the algorithm to converge to a policy that is suboptimal regardless of the computation time. This paper proposes and evaluates two new algorithms, POMCPOW and PFT-DPW, that overcome this deficiency by using weighted particle filtering. Simulation results show that these modifications allow the algorithms to succeed where previous approaches fail.
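To make the two ingredients named in this abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly used progressive widening test (a node may add a child while its child count is at most k·N^α for visit count N) and a plain importance reweighting step for a weighted particle belief. The names `should_widen`, `reweight`, `k`, `alpha`, and `obs_likelihood` are hypothetical and chosen only for illustration.

```python
import math


def should_widen(num_children, num_visits, k=1.0, alpha=0.5):
    """Progressive widening test (assumed form): allow a new child
    while the number of children is at most k * N**alpha."""
    return num_children <= k * num_visits ** alpha


def reweight(particles, weights, observation, obs_likelihood):
    """Weighted particle update: scale each particle's weight by the
    likelihood of the observation given that particle, then normalize.
    Keeping weights (instead of resampling to one state per node) is
    what lets a belief node retain many particles."""
    scaled = [w * obs_likelihood(observation, s)
              for s, w in zip(particles, weights)]
    total = sum(scaled)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under all particles")
    return [w / total for w in scaled]


def gaussian_likelihood(observation, state):
    """Hypothetical observation model: unnormalized unit-variance Gaussian."""
    return math.exp(-0.5 * (observation - state) ** 2)


if __name__ == "__main__":
    # A node with 4 visits may hold at most 2 children when k=1, alpha=0.5.
    print(should_widen(2, 4), should_widen(3, 4))
    # The particle closer to the observation receives the larger weight.
    print(reweight([0.0, 1.0], [0.5, 0.5], 0.0, gaussian_likelihood))
```

In this sketch the widening test limits how many distinct actions or observations a node branches on, while the reweighting step is the part that keeps each belief from degenerating to a single state.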
Voronoi Progressive Widening: Efficient Online Solvers for Continuous Space MDPs and POMDPs with Provably Optimal Components
Lim, Michael H., Tomlin, Claire J., Sunberg, Zachary N.
Markov decision processes (MDPs) and partially observable MDPs (POMDPs) can effectively represent complex real-world decision and control problems. However, continuous space MDPs and POMDPs, i.e., those with continuous state, action, and observation spaces, are extremely difficult to solve, and there are few online algorithms with convergence guarantees. This paper introduces Voronoi Progressive Widening (VPW), a general technique for modifying tree search algorithms to handle continuous or hybrid action spaces effectively, and proposes and evaluates three continuous space solvers: VOSS, VOWSS, and VOMCPOW. VOSS and VOWSS are theoretical tools, based on sparse sampling and Voronoi optimistic optimization, designed to justify VPW-based online solvers. While previous algorithms have enjoyed convergence guarantees for problems with continuous state and observation spaces, VOWSS is the first with global convergence guarantees for problems that additionally have continuous action spaces. VOMCPOW is a versatile and efficient VPW-based algorithm that consistently outperforms POMCPOW and BOMCP in several simulation experiments.
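As a rough illustration of the Voronoi-based action sampling idea behind this abstract, here is a simplified sketch that is not the paper's algorithm. It assumes a one-dimensional action interval, an exploration probability `omega`, and rejection sampling to draw a new action from the Voronoi cell of the best action found so far. The function name `sample_voronoi_action` and all of its parameters are hypothetical.

```python
import random


def sample_voronoi_action(actions, values, low, high,
                          omega=0.3, max_tries=1000, rng=random):
    """Propose a new action in the interval [low, high].

    With probability omega (or when no actions exist yet), sample
    uniformly from the whole interval. Otherwise, sample uniformly and
    reject candidates until one falls in the Voronoi cell of the
    best-valued action, i.e. its nearest existing action is the best one.
    """
    if not actions or rng.random() < omega:
        return rng.uniform(low, high)
    best = max(range(len(actions)), key=lambda i: values[i])
    for _ in range(max_tries):
        candidate = rng.uniform(low, high)
        nearest = min(range(len(actions)),
                      key=lambda i: abs(candidate - actions[i]))
        if nearest == best:
            return candidate
    # Fallback if the best cell is too small to hit by rejection sampling.
    return actions[best]


if __name__ == "__main__":
    rng = random.Random(0)
    # The best action is 0.9, so with omega=0 new samples land nearer to 0.9.
    print(sample_voronoi_action([0.1, 0.9], [0.0, 1.0], 0.0, 1.0,
                                omega=0.0, rng=rng))
```

A tree search could call such a sampler whenever a progressive widening test permits adding a new action, which concentrates new actions near promising ones while the `omega` branch preserves global exploration.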
Efficiency and Safety in Autonomous Vehicles Through Planning With Uncertainty
Sunberg, Zachary N. (Stanford University)
Autonomous vehicles are quickly becoming an important part of human society for transportation, monitoring, agriculture, and other applications. In these applications, there is a fundamental tradeoff between safety and efficiency that is especially salient when the autonomous vehicles interact directly with humans. A key to maintaining safety without sacrificing efficiency is dealing with uncertainty properly so that robots can be assertive when it is appropriate and careful in dangerous situations. The research that will be presented in my thesis uses the partially observable Markov decision process framework to approach this challenge, exploring several applications and proposing a new solution approach that is able to handle continuous action and observation spaces, a qualitative improvement over current methods.