On the convergence of optimistic policy iteration for stochastic shortest path problem
