Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes

Pareigis, Stephan

Neural Information Processing Systems 

The optimal control problem reduces to a boundary value problem for a fully nonlinear second-order elliptic differential equation of Hamilton-Jacobi-Bellman (HJB) type. Numerical analysis provides multigrid methods for this kind of equation. In the case of Learning Control, however, the systems of equations on the various grid levels are obtained using observed information (transitions and local cost). To ensure consistency, special attention needs to be directed toward the type of time and space discretization during the observation. An algorithm for multi-grid observation is proposed.
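
For orientation, one standard textbook form of such a boundary value problem is sketched here. The notation is an assumption made for illustration and is not taken from the paper: a state X_t in a domain \Omega with drift b, diffusion coefficient \sigma, local cost c, boundary cost \Phi, and value function V for an exit-time cost criterion.

```latex
% Hypothetical notation (b, \sigma, c, \Phi, \Omega, V), not the paper's own.
\begin{aligned}
dX_t &= b(X_t, u_t)\,dt + \sigma(X_t, u_t)\,dW_t, \\
0 &= \min_{u}\Big\{ c(x,u) + b(x,u)\cdot\nabla V(x)
      + \tfrac{1}{2}\operatorname{tr}\!\big(\sigma(x,u)\,\sigma(x,u)^{\top}\,\nabla^{2} V(x)\big) \Big\},
      \qquad x \in \Omega, \\
V(x) &= \Phi(x), \qquad x \in \partial\Omega .
\end{aligned}
```

The minimization over the control u inside the second-order operator is what makes the equation fully nonlinear. In a learning setting, b, \sigma and c would not be known in advance and would have to be estimated from observed transitions and local costs.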
