On Value Iteration Convergence in Connected MDPs

Arsenii Mustafin, Alex Olshevsky, Ioannis Ch. Paschalidis

arXiv.org Artificial Intelligence 

This paper establishes that, for an MDP with a unique optimal policy and an ergodic associated transition matrix, various versions of the Value Iteration algorithm converge at a geometric rate faster than the discount factor γ, under both the discounted and average-reward criteria.
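
As a rough illustration of this claim, the sketch below runs standard synchronous Value Iteration on a small discounted MDP and compares the per-step shrinkage of successive value differences against γ. Everything in it is an assumption made for illustration and is not taken from the paper: the two-state, two-action MDP, the helper names (`value_iteration`, `span`, `span_ratios`), and the choice to measure progress in the span seminorm (max minus min of a vector), which ignores the constant shift that by itself contracts at exactly γ.

```python
def value_iteration(P, R, gamma, iters):
    """Synchronous value iteration.

    P[a][s][t] is the probability of moving from state s to state t under
    action a, and R[a][s] is the reward for taking action a in state s.
    Returns the list of value vectors v_0, v_1, ..., v_iters.
    """
    n_actions, n_states = len(P), len(R[0])
    v = [0.0] * n_states
    history = [v]
    for _ in range(iters):
        # Bellman optimality update: best one-step lookahead in every state.
        v = [
            max(
                R[a][s] + gamma * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        history.append(v)
    return history


def span(x):
    """Span seminorm: largest entry minus smallest entry."""
    return max(x) - min(x)


def span_ratios(history):
    """Ratios span(v_{k+2} - v_{k+1}) / span(v_{k+1} - v_k) along the run."""
    diffs = [[b - a for a, b in zip(u, w)] for u, w in zip(history, history[1:])]
    spans = [span(d) for d in diffs]
    return [s1 / s0 for s0, s1 in zip(spans, spans[1:])]


# Hypothetical toy MDP (an assumption for illustration, not from the paper).
# Every transition probability is positive, so each policy's chain is ergodic.
# Action 1 is heavily penalized, so action 0 is the unique optimal choice.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.5, 0.5]],  # action 1
]
R = [
    [1.0, 0.0],      # action 0
    [-10.0, -10.0],  # action 1
]
gamma = 0.9

ratios = span_ratios(value_iteration(P, R, gamma, iters=15))
print("largest observed span ratio:", max(ratios), "discount factor:", gamma)
```

In this toy case action 0 is greedy at every iteration, so successive differences satisfy d_{k+1} = γ P_0 d_k, where P_0 is the transition matrix of action 0. Its second eigenvalue is 0.7, so the span of the differences shrinks by about 0.9 × 0.7 = 0.63 per step, which is below γ = 0.9. This only shows the flavor of the result on one hand-picked instance; the paper's conditions and proofs are what establish it in general.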
