Models and algorithms for skip-free Markov decision processes on trees

Nov-8-2013–arXiv.org Artificial Intelligence

Markov decision processes (MDPs) provide a class of stochastic optimisation models that have found wide applicability to problems in Operational Research. The standard methods for computing an optimal policy are based on value iteration, policy iteration and linear programming algorithms (White 1993). Each approach has its advantages and disadvantages. In particular, each step in value iteration is relatively computationally inexpensive but the value function may take some time to converge and the algorithm provides no direct check that it has computed the optimal value function and an optimal policy. Conversely, each step in policy iteration may be computationally expensive but the algorithm can be proved to converge in a finite number of steps, confirms when it has converged and automatically identifies the optimal value function and an optimal policy on exit.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

Nov-8-2013

arXiv.org PDF

Add feedback

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Learning Graphical Models
    - Undirected Networks > Markov Models (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found