Sparse tree search optimality guarantees in POMDPs with continuous observation spaces

Lim, Michael H., Tomlin, Claire J., Sunberg, Zachary N.

arXiv.org Machine Learning 

Several online tree search techniques have been proposed to solve fully observable Markov decision processes with continuous state spaces, most prominently Sparse-UCT (Bjarnason et al., 2009) and double progressive widening (Couëtoux et al., 2011). There have also been several approaches for solving POMDPs or belief-space MDPs with continuous observation spaces. For example, Monte Carlo Value Iteration (MCVI) can use a classifier to deal with continuous observation spaces (Bai et al., 2014). Others partition the observation space (Hoey and Poupart, 2005) or assume that the most likely observation is always received (Platt et al., 2010). Other approaches are based on motion planning (Melchior and Simmons, 2007; Prentice and Roy, 2009; Bry and Roy, 2011; Agha-Mohammadi et al., 2011), locally optimizing pre-computed trajectories (Van Den Berg et al., 2012), or optimizing open-loop plans (Sunberg et al., 2013). McAllester and Singh (1999) also extend the sparse sampling algorithm of Kearns et al. (2002), but they use a belief simplification scheme instead of the particle sampling scheme used in this work.
