Provably-Efficient Double Q-Learning

Weng, Wentao, Gupta, Harsh, He, Niao, Ying, Lei, Srikant, R.

Jul-9-2020–arXiv.org Machine Learning

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and that of Q-learning. Our result builds upon an analysis of linear stochastic approximation based on Lyapunov equations and applies both to the tabular setting and to linear function approximation, provided that the optimal policy is unique and the algorithms converge. We show that the asymptotic mean-squared error of Double Q-learning is exactly equal to that of Q-learning if Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators. We also present some practical implications of this theoretical observation using simulations.
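The setup compared in the abstract can be sketched in the tabular case as follows. This is a minimal illustration, not the authors' code: the two-state toy environment in `run_demo`, the constant step size `alpha`, and all function names are assumptions made for the example. The only parts taken from the abstract are that Double Q-learning runs with twice the learning rate of Q-learning and reports the average of its two estimators.

```python
import random

def q_learning_step(Q, s, a, r, s_next, alpha, gamma):
    """One tabular Q-learning update on the table Q[state][action]."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

def double_q_learning_step(QA, QB, s, a, r, s_next, alpha, gamma, rng):
    """One tabular Double Q-learning update: pick one table at random,
    choose the greedy next action with it, evaluate it with the other."""
    if rng.random() < 0.5:
        update, other = QA, QB
    else:
        update, other = QB, QA
    row = update[s_next]
    a_star = max(range(len(row)), key=row.__getitem__)
    target = r + gamma * other[s_next][a_star]
    update[s][a] += alpha * (target - update[s][a])

def averaged_estimate(QA, QB):
    """Entrywise average of the two Double Q-learning tables."""
    return [[0.5 * (x + y) for x, y in zip(ra, rb)] for ra, rb in zip(QA, QB)]

def run_demo(steps=20000, alpha=0.01, gamma=0.9, seed=0):
    """Hypothetical toy MDP (an assumption, not from the paper): 2 states,
    2 actions, uniform random transitions, noisy reward of mean 1 when the
    action index equals the state index and mean 0 otherwise."""
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    Q = [[0.0] * n_actions for _ in range(n_states)]
    QA = [[0.0] * n_actions for _ in range(n_states)]
    QB = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        a = rng.randrange(n_actions)        # uniform exploration
        s_next = rng.randrange(n_states)
        r = (1.0 if a == s else 0.0) + rng.gauss(0.0, 1.0)
        q_learning_step(Q, s, a, r, s_next, alpha, gamma)
        # Double Q-learning uses twice the learning rate of Q-learning.
        double_q_learning_step(QA, QB, s, a, r, s_next, 2 * alpha, gamma, rng)
        s = s_next
    # Compare Q against the average of the two Double Q-learning tables.
    return Q, averaged_estimate(QA, QB)
```

Calling `run_demo()` returns the Q-learning table and the averaged Double Q-learning table trained on the same sample path, which can then be compared against each other or against a separately computed optimal Q-function. A single run says nothing about mean-squared error; estimating that would require averaging the squared error over many seeds.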

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Michigan > Washtenaw County
    - Ann Arbor (0.14)
  - Illinois > Champaign County
    - Urbana (0.04)
- Europe > United Kingdom
  - England
    - Oxfordshire > Oxford (0.04)
    - Cambridgeshire > Cambridge (0.04)
- Asia > China
  - Beijing > Beijing (0.04)

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
