Direct value-approximation for factored MDPs

Dec-31-2002–Neural Information Processing Systems

We present a simple approach for computing reasonable policies for factored Markov decision processes (MDPs), when the optimal valuefunction can be approximated by a compact linear form. Our method is based on solving a single linear program that approximates thebest linear fit to the optimal value function. By applying an efficient constraint generation procedure we obtain an iterative solution method that tackles concise linear programs. This direct linear programming approach experimentally yields a significant reductionin computation time over approximate value-and policy-iteration methods (sometimes reducing several hours to a few seconds). However, the quality of the solutions produced by linear programming is weaker-usually about twice the approximation errorfor the same approximating class. Nevertheless, the speed advantage allows one to use larger approximation classes to achieve similar error in reasonable time. 1 Introduction Markov decision processes (MDPs) form a foundation for control in uncertain and stochastic environments and reinforcement learning. Standard methods such as value-iteration, policy-iteration and linear programming can be used to produce optimal control policies for MDPs that are expressed in explicit form; that is, the policy, value function and state transition model are all represented in a tabular manner that explicitly enumerates the state space.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Dec-31-2002

Conferences PDF

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.92)
  - Machine Learning
    - Learning Graphical Models (0.87)
    - Reinforcement Learning (0.66)

Duplicate Docs Excel Report

Title
Direct value-approximation for factored MDPs
Direct value-approximation for factored MDPs

Similar Docs Excel Report more

Title	Similarity	Source
None found