Finite-Time Analysis for Double Q-learning

Neural Information Processing Systems 

The theoretical performance of Q-learning has also been intensively explored. Its asymptotic convergence was established in Tsitsiklis (1994); Jaakkola et al. (1994); Borkar and Meyn (2000); Melo (2001); Lee and He (2019).
