Approximate Kalman Filter Q-Learning for Continuous State-Space MDPs

Sep-26-2013–arXiv.org Machine Learning

We seek to learn an effective policy for a Markov Decision Process (MDP) with continuous states via Q-Learning. Given a set of basis functions over state-action pairs, we search for a corresponding set of linear weights that minimizes the mean Bellman residual. Our algorithm uses a Kalman filter model to estimate those weights. We have also developed a simpler approximate Kalman filter model that outperforms the current state-of-the-art projected TD-Learning methods on several standard benchmark problems.
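To make the idea of estimating linear Q-function weights with a Kalman filter concrete, the following is a minimal sketch and not the authors' implementation. It assumes Q(s, a) is approximated by the dot product of a weight vector and a feature vector phi(s, a), that the weights carry a Gaussian belief with mean `w` and covariance `P`, and that each Q-Learning target is treated as a noisy scalar observation with variance `obs_var`. The function names, the `obs_var` and `gamma` parameters, and their default values are hypothetical choices made for illustration. This is the textbook full-covariance Kalman update; the simpler approximate model mentioned in the abstract is not reproduced here.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def q_target(w, reward, next_features, gamma=0.95):
    """One-step Q-Learning target: r + gamma * max_a' w . phi(s', a').

    next_features holds one feature vector per action available in the
    next state; an empty list is treated as a terminal state.
    """
    if not next_features:
        return reward
    return reward + gamma * max(dot(w, f) for f in next_features)


def kalman_q_update(w, P, phi, target, obs_var=1.0):
    """Scalar-observation Kalman update of the weight belief (w, P).

    Assumed observation model: target = w . phi + noise,
    with noise variance obs_var.
    """
    n = len(w)
    # P @ phi, reused for both the gain and the covariance update.
    P_phi = [dot(P[i], phi) for i in range(n)]
    # Variance of the predicted observation (innovation variance).
    innovation_var = dot(phi, P_phi) + obs_var
    # Kalman gain: how strongly each weight reacts to the residual.
    gain = [p / innovation_var for p in P_phi]
    # Difference between the target and the current Q estimate.
    residual = target - dot(w, phi)
    w_new = [w[i] + gain[i] * residual for i in range(n)]
    # P_new = P - gain * (P @ phi)^T, valid because P is symmetric.
    P_new = [[P[i][j] - gain[i] * P_phi[j] for j in range(n)]
             for i in range(n)]
    return w_new, P_new
```

In this sketch, a transition (s, a, r, s') would be processed by first calling `q_target` with the features of every action in s' and then passing the result to `kalman_q_update` together with phi(s, a). Features with large prior variance in `P` move quickly toward the target, while well-determined weights change little.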

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States > California > Santa Clara County (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.49)
