Learning first-order Markov models for control

Pieter Abbeel, Andrew Y. Ng

Neural Information Processing Systems 

First-order Markov models have been successfully applied to many problems, for example in modeling sequential data using Markov chains, and modeling control problems using the Markov decision process (MDP) formalism. If a first-order Markov model's parameters are estimated from data, the standard maximum likelihood estimator considers only the first-order (single-step) transitions. But for many problems, the first-order conditional independence assumptions are not satisfied, and as a result the higher-order transition probabilities may be poorly approximated. Motivated by the problem of learning an MDP's parameters for control, we propose an algorithm for learning a first-order Markov model that explicitly takes into account higher-order interactions during training. Our algorithm uses an optimization criterion different from maximum likelihood, and allows us to learn models that capture longer range effects, but without giving up the benefits of using first-order Markov models. Our experimental results also show the new algorithm outperforming conventional maximum likelihood estimation in a number of control problems where the MDP's parameters are estimated from data.
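To make the contrast the abstract draws concrete, the sketch below is a minimal illustration (not the paper's algorithm; the helper names and the toy generating process are assumptions for this example). It estimates a first-order transition matrix by maximum likelihood from single-step counts, then compares the model's implied two-step transition probabilities with the empirical two-step frequencies. A large gap is the kind of higher-order mismatch that motivates training with a criterion other than plain maximum likelihood.

import numpy as np

def mle_transition_matrix(states, n_states):
    """Maximum likelihood estimate: count single-step transitions, normalize rows."""
    counts = np.zeros((n_states, n_states))
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid division by zero for unvisited states
    return counts / row_sums

def empirical_two_step(states, n_states):
    """Empirical two-step transition probabilities P(s_{t+2} | s_t)."""
    counts = np.zeros((n_states, n_states))
    for s, s_next2 in zip(states[:-2], states[2:]):
        counts[s, s_next2] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return counts / row_sums

# Toy second-order process (purely illustrative): the next state usually depends
# on the previous two states, so a first-order model cannot capture it exactly.
rng = np.random.default_rng(0)
seq = [0, 1]
for _ in range(5000):
    if rng.random() < 0.9:
        seq.append((seq[-1] + seq[-2]) % 3)
    else:
        seq.append(int(rng.integers(0, 3)))

T = mle_transition_matrix(seq, 3)
two_step_model = T @ T                    # two-step prediction of the first-order model
two_step_data = empirical_two_step(seq, 3)

# A large value here indicates higher-order structure the MLE first-order model misses.
print(np.abs(two_step_model - two_step_data).max())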
