Efficient Inference in Markov Control Problems

Furmston, Thomas, Barber, David

arXiv.org Artificial Intelligence 

Thomas Furmston (Computer Science Department, University College London, London, WC1E 6BT)
David Barber (Computer Science Department, University College London, London, WC1E 6BT)

Abstract: Markov control algorithms that perform smooth, non-greedy updates of the policy have been shown to be very general and versatile, with policy gradient and Expectation Maximisation algorithms being particularly popular. For these algorithms, marginal inference of the reward-weighted trajectory distribution is required to perform policy updates. We discuss a new exact inference algorithm for these marginals in the finite-horizon case that is more efficient than the standard approach based on classical forward-backward recursions. We also provide a principled extension to infinite-horizon Markov Decision Problems that explicitly accounts for an infinite horizon. This extension provides a novel algorithm for both policy gradients and Expectation Maximisation in infinite-horizon problems. The state and action spaces can be either discrete or continuous. For a discount factor $\gamma \in [0, 1)$, the time-dependent reward is defined as $R_t(s_t, a_t) \equiv \gamma^{t-1} R(s_t, a_t)$ for a stationary reward $R(s_t, a_t)$.
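The discounted reward definition closing the abstract can be sketched directly in code. This is a minimal illustration, not part of the paper; the stationary reward function below is a hypothetical placeholder standing in for $R(s_t, a_t)$.

```python
def stationary_reward(state, action):
    # Hypothetical stationary reward R(s, a), used only for illustration.
    return float(state + action)

def discounted_reward(state, action, t, gamma=0.9):
    # Time-dependent reward R_t(s_t, a_t) = gamma^(t-1) * R(s_t, a_t),
    # as defined in the abstract; time indexing starts at t = 1, so the
    # first step is undiscounted and later steps shrink geometrically.
    assert 0.0 <= gamma < 1.0  # discount factor must lie in [0, 1)
    return gamma ** (t - 1) * stationary_reward(state, action)

print(discounted_reward(1, 1, t=1))  # undiscounted first step: 2.0
print(discounted_reward(1, 1, t=2))  # one step of discounting: 1.8
```

The geometric discounting is what makes the infinite-horizon total reward finite, which is the setting the paper's infinite-horizon extension addresses.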
