The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Liu, Shuze, Chen, Shuhang, Zhang, Shangtong

Feb-5-2024–arXiv.org Artificial Intelligence

Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gradient descent and temporal difference learning. One fundamental challenge in analyzing a stochastic approximation algorithm is to establish its stability, i.e., to show that the stochastic vector iterates are bounded almost surely. In this paper, we extend the celebrated Borkar-Meyn theorem for stability from the Martingale difference noise setting to the Markovian noise setting, which greatly improves its applicability in reinforcement learning, especially in those off-policy reinforcement learning algorithms with linear function approximation and eligibility traces. Central to our analysis is the diminishing asymptotic rate of change of a few functions, which is implied by both a form of strong law of large numbers and a commonly used V4 Lyapunov drift condition and trivially holds if the Markov chain is finite and irreducible.

assumption 6, lim, stochastic approximation, (11 more...)

arXiv.org Artificial Intelligence

Feb-5-2024

arXiv.org PDF

Add feedback

Country:
- North America
  - Canada > Alberta (0.14)
  - United States
    - New York (0.04)
    - Virginia > Albemarle County
      - Charlottesville (0.04)
    - Massachusetts > Middlesex County
      - Belmont (0.04)
- Europe > United Kingdom
  - England
    - Cambridgeshire > Cambridge (0.04)
    - Oxfordshire > Oxford (0.04)
- Asia > Japan
  - Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Statistical Learning > Gradient Descent (0.54)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.48)