Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes

Dec-31-1997–Neural Information Processing Systems

The optimal control problem reduces to a boundary value problem for a fully nonlinear second-order elliptic differential equation of Hamilton-Jacobi-Bellman (HJB) type. Numerical analysis provides multigrid methods for this kind of equation. In the case of Learning Control, however, the systems of equations on the various grid levels are obtained using observed information (transitions and local cost). To ensure consistency, special attention needs to be directed toward the type of time and space discretization during the observation. An algorithm for multi-grid observation is proposed. The multi-grid algorithm is demonstrated on a simple queuing problem.

1 Introduction

Controlled Diffusion Processes (CDP) are the analogue of Markov Decision Problems in continuous state space and continuous time.
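
For orientation, the kind of HJB boundary value problem the abstract refers to can be written in a generic textbook form. This is a sketch under assumptions, not the paper's own notation: the symbols b (drift), \sigma (diffusion coefficient), c (local cost), g (boundary cost), U (control set), and \Omega (state domain) are introduced here for illustration, and an undiscounted exit-time cost is assumed.

```latex
% Controlled diffusion (assumed form):
%   dx_t = b(x_t, u_t)\,dt + \sigma(x_t, u_t)\,dW_t
% Value function V: expected cost accumulated until the state leaves \Omega.
\begin{aligned}
\min_{u \in U} \Big\{ c(x,u) + b(x,u) \cdot \nabla V(x)
  + \tfrac{1}{2} \operatorname{tr}\!\big( \sigma(x,u)\,\sigma(x,u)^{\top} \nabla^{2} V(x) \big) \Big\} &= 0,
  && x \in \Omega, \\
V(x) &= g(x), && x \in \partial\Omega .
\end{aligned}
```

The minimization over u makes the equation fully nonlinear, and the trace term makes it second order. In the learning setting described above, b, \sigma, and c are not known in closed form, so the discretized equations on each grid level have to be estimated from observed transitions and local costs.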

algorithm, artificial intelligence, reinforcement learning

Country:
- Europe > Germany (0.14)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)
