Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective
arXiv.org Artificial Intelligence
TD-learning is a fundamental algorithm in the field of reinforcement learning (RL) that is employed to evaluate a given policy by estimating the corresponding value function for a Markov decision process. While significant progress has been made in the theoretical analysis of TD-learning, recent research has established guarantees on its statistical efficiency by developing finite-time error bounds. This paper aims to contribute to the existing body of knowledge by presenting a novel finite-time analysis of tabular temporal difference (TD) learning, which makes direct and effective use of discrete-time stochastic linear system models and leverages Schur matrix properties. The proposed analysis covers both on-policy and off-policy settings in a unified manner. By adopting this approach, we hope to offer new and straightforward templates that not only shed further light on the analysis of TD-learning and related RL algorithms but also provide valuable insights for future research in this domain.
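For readers unfamiliar with the algorithm being analyzed, the following is a minimal sketch of tabular TD(0) policy evaluation. The environment (a 5-state random walk) and all parameters are illustrative assumptions for this sketch and are not taken from the paper:

```python
import random

def td0_random_walk(episodes=10000, alpha=0.05, gamma=1.0, seed=0):
    """Tabular TD(0) on a 5-state random walk (states 1..5 non-terminal,
    0 and 6 terminal; reward 1 only on reaching state 6).

    The true values under the uniform-random policy are V(s) = s/6.
    This is a generic textbook instance, not the paper's experimental setup.
    """
    rng = random.Random(seed)
    V = [0.0] * 7  # value estimates; terminals stay at 0
    for _ in range(episodes):
        s = 3  # every episode starts in the middle state
        while s not in (0, 6):
            s2 = s + rng.choice((-1, 1))
            r = 1.0 if s2 == 6 else 0.0
            # Tabular TD(0) update: V(s) <- V(s) + alpha * (r + gamma*V(s') - V(s))
            V[s] += alpha * (r + gamma * V[s2] - V[s])
            s = s2
    return V
```

With a constant step size the estimates fluctuate around the true values V(s) = s/6; the finite-time bounds the paper develops quantify how fast such estimates concentrate.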
Jun-2-2023