A Finite-Time Analysis of TD Learning with Linear Function Approximation without Projections nor Strong Convexity

Jun-3-2025–arXiv.org Machine Learning

We investigate the finite-time convergence properties of Temporal Difference (TD) learning with linear function approximation, a cornerstone algorithm in reinforcement learning. While prior work has established convergence guarantees, these results typically rely on the assumption that each iterate is projected onto a bounded set or that the learning rate is set according to the unknown strong convexity constant -- conditions that are both artificial and do not match the current practice. In this paper, we challenge the necessity of such assumptions and present a refined analysis of TD learning. We show that the simple projection-free variant converges with a rate of $\tilde{\mathcal{O}}(\frac{||θ^*||^2_2}{\sqrt{T}})$, even in the presence of Markovian noise. Our analysis reveals a novel self-bounding property of the TD updates and exploits it to guarantee bounded iterates.

approximation, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

Jun-3-2025

arXiv.org PDF

Add feedback

Country:
- Asia > Middle East > Saudi Arabia (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Uncertainty
    - Fuzzy Logic (0.61)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found