On the convergence of optimistic policy iteration for stochastic shortest path problem

Chen, Yuanlong

arXiv.org Machine Learning 

In this paper, we prove convergence results for a special case of the optimistic policy iteration algorithm for the stochastic shortest path problem mentioned in [5]. We consider both Monte Carlo and TD(λ) methods for the policy evaluation step, under the condition that the termination state will eventually be reached almost surely.
