From Weighted Classification to Policy Search

Dec-31-2006–Neural Information Processing Systems

This paper proposes an algorithm to convert a T-stage stochastic decision problem with a continuous state space to a sequence of supervised learning problems. The optimization problem associated with the trajectory tree and random trajectory methods of Kearns, Mansour, and Ng (2000) is solved using the Gauss-Seidel method. The algorithm breaks a multistage reinforcement learning problem into a sequence of single-stage reinforcement learning subproblems, each of which is solved via an exact reduction to a weighted-classification problem that can be solved using off-the-shelf methods. Thus the algorithm converts a reinforcement learning problem into simpler supervised learning subproblems. It is shown that the method converges in a finite number of steps to a solution that cannot be further improved by componentwise optimization. The implication of the proposed algorithm is that a plethora of classification methods can be applied to find policies in the reinforcement learning problem.
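The reduction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy dynamics and rewards, the two-action setting, the threshold ("stump") classifier standing in for an off-the-shelf method, and every function name below are assumptions made for illustration. Trajectories are sampled with uniformly random actions, the value of a policy is estimated from the trajectories whose actions agree with it, and each stage is re-optimized in turn (Gauss-Seidel style) by solving one weighted-classification problem while the other stages are held fixed.

```python
import numpy as np

def fit_stump(x, y, w):
    """Weighted classifier on 1-D inputs: predict s if x > c, else 1 - s.
    Returns the (c, s) that maximizes sum(w * [prediction == y]) on the sample."""
    xs = np.sort(x)
    cuts = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0))
    best_score, best = -np.inf, (0.0, 1)
    for c in cuts:
        for s in (0, 1):
            score = np.sum(w * (np.where(x > c, s, 1 - s) == y))
            if score > best_score:
                best_score, best = score, (c, s)
    return best

def predict(stump, x):
    c, s = stump
    return np.where(x > c, s, 1 - s)

def sample_trajectories(n, T, rng):
    """Hypothetical toy problem: action 1 pushes the state up, action 0 down;
    each stage pays 1 - |next state|. Actions are drawn uniformly at random."""
    X = np.zeros((n, T))
    A = rng.integers(0, 2, size=(n, T))
    G = np.zeros(n)  # total reward of each trajectory
    x = rng.normal(0.0, 1.0, size=n)
    for t in range(T):
        X[:, t] = x
        x = x + (2 * A[:, t] - 1) * 0.5 + rng.normal(0.0, 0.1, size=n)
        G += 1.0 - np.abs(x)
    return X, A, G

def agreement(policy, X, A, skip=None):
    """True where sampled actions match the policy at every stage except `skip`."""
    ok = np.ones(len(X), dtype=bool)
    for t, stump in enumerate(policy):
        if t != skip:
            ok &= predict(stump, X[:, t]) == A[:, t]
    return ok

def empirical_value(policy, X, A, G):
    """Importance-weighted value estimate (uniform random actions, 2 actions)."""
    T = X.shape[1]
    return (2.0 ** T) * np.mean(G * agreement(policy, X, A))

def stage_update(policy, t, X, A, G):
    """Single-stage subproblem as weighted classification (two actions only)."""
    r = G * agreement(policy, X, A, skip=t)      # payoff if stage t picks A[:, t]
    y = np.where(r >= 0, A[:, t], 1 - A[:, t])   # negative payoff: prefer other action
    return fit_stump(X[:, t], y, np.abs(r))      # weights are |payoff|

def policy_search(X, A, G, max_sweeps=20):
    """Componentwise (Gauss-Seidel style) sweeps over the stages."""
    T = X.shape[1]
    policy = [(0.0, 1)] * T
    value = empirical_value(policy, X, A, G)
    for _ in range(max_sweeps):
        improved = False
        for t in range(T):
            cand = list(policy)
            cand[t] = stage_update(policy, t, X, A, G)
            v = empirical_value(cand, X, A, G)
            if v > value + 1e-12:                # accept strict improvements only
                policy, value, improved = cand, v, True
        if not improved:
            break
    return policy, value

if __name__ == "__main__":
    X, A, G = sample_trajectories(400, 3, np.random.default_rng(0))
    print(policy_search(X, A, G))
```

In this sketch, accepting only strict improvements of the empirical value is what keeps the sweeps from cycling. The stump could be swapped for any classifier that accepts per-sample weights; the label-flipping step as written relies on there being exactly two actions.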

artificial intelligence, realization, reinforcement learning


Country:
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Industry:
- Education > Focused Education > Special Education (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
