Finite-Time Analysis of Asynchronous Q-Learning with Discrete-Time Switching System Models

Feb-19-2021–arXiv.org Artificial Intelligence

This paper develops a novel framework to analyze the convergence of Q-learning algorithm from a discrete-time switching system perspective. We prove that asynchronous Q-learning with a constant step-size can be naturally formulated as discrete-time stochastic switched linear systems. It offers novel and intuitive insights on Q-learning mainly based on control theoretic frameworks. For instance, the proposed analysis explains the overestimation phenomenon in Q-learning due to the maximization bias. Based on the control system theoretic argument and some nice structures of Q-learning, a new finite-time analysis of the Q-learning is given with a novel error bound.

comparison system, convergence, q-learning, (14 more...)

arXiv.org Artificial Intelligence

Feb-19-2021

arXiv.org PDF

Add feedback

Country:
- Asia
  - Middle East > Jordan (0.04)
  - South Korea > Daejeon
    - Daejeon (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found