Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features

Jun-19-2026, 12:14:19 GMT–Neural Information Processing Systems

Linear TD(λ) is one of the most fundamental reinforcement learning algorithms for policy evaluation. Prior convergence rates have typically been established under the assumption of linearly independent features, which does not hold in many practical scenarios. This paper instead establishes the first L2 convergence rates for linear TD(λ) operating under arbitrary features, without any algorithmic modification or additional assumptions. The results apply to both the discounted and average-reward settings. To address the potential non-uniqueness of solutions that arbitrary features can cause, the authors develop a novel stochastic approximation result whose convergence rates are to the solution set rather than to a single point.
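
To make the non-uniqueness issue concrete, the following is a minimal sketch of discounted linear TD(λ) with accumulating traces on a toy problem. Everything specific here is an illustrative assumption and not taken from the paper: the two-state chain, the rewards, the three linearly dependent features (the third column is the sum of the first two), the choice λ = 0.5, the step-size schedule, and the helper names `dot` and `td_lambda`.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td_lambda(features, rewards, P, w0, gamma=0.9, lam=0.5,
              steps=200_000, seed=0):
    """Linear TD(lambda) with accumulating eligibility traces.

    features[s] is the feature vector of state s; the vectors may be
    linearly dependent. The step size alpha_t = 10 / (t + 1000) is a
    hypothetical diminishing schedule chosen for this toy problem.
    """
    rng = random.Random(seed)
    states = range(len(features))
    w = list(w0)
    trace = [0.0] * len(w)
    s = 0
    for t in range(steps):
        # Sample the next state from row s of the transition matrix.
        s_next = rng.choices(states, weights=P[s])[0]
        # Temporal difference error under the current weights.
        delta = rewards[s] + gamma * dot(w, features[s_next]) - dot(w, features[s])
        # Decay the trace and add the current feature vector.
        trace = [gamma * lam * e + x for e, x in zip(trace, features[s])]
        alpha = 10.0 / (t + 1000.0)
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
        s = s_next
    return w


# Toy problem (assumed): column 3 of the feature matrix equals
# column 1 + column 2, so the features are linearly dependent.
features = [[1.0, 0.0, 1.0],
            [0.0, 1.0, 1.0]]
rewards = [1.0, 0.0]
P = [[0.5, 0.5],
     [0.5, 0.5]]

# The two initial points differ by (3, 3, -3), which lies in the null
# space of the feature matrix, so both start from the same value estimates.
w_a = td_lambda(features, rewards, P, w0=[0.0, 0.0, 0.0])
w_b = td_lambda(features, rewards, P, w0=[3.0, 3.0, -3.0])

values_a = [dot(w_a, x) for x in features]
values_b = [dot(w_b, x) for x in features]
print("weights A:", w_a, "values A:", values_a)
print("weights B:", w_b, "values B:", values_b)
```

For this chain the true discounted values are 5.5 and 4.5, obtained by solving v = r + γPv by hand. Both runs should approach those values while ending at different weight vectors, because each update is a multiple of a trace built from feature vectors and therefore never changes the null-space component of the initial weights. This is the sense in which convergence can only be stated with respect to a set of solutions in weight space.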

approximation, machine learning, reinforcement learning

Country:
- North America > United States (0.46)

Genre:
- Research Report > Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
