Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation

Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov

arXiv.org Artificial Intelligence 

The LSA algorithm is central in statistics, machine learning, and linear systems identification; see, e.g., Eweda and Macchi (1983); Widrow and Stearns (1985); Benveniste et al. (2012); Kushner and Yin (2003) and the references therein. More recently, it has attracted renewed interest in machine learning, especially for high-dimensional least-squares and reinforcement learning (RL) problems; see Bertsekas and Tsitsiklis (2003); Bottou et al. (2018); Sutton (1988); Bertsekas (2019); Watkins and Dayan (1992). The LSA and LSA-PR recursions (1) have been the subject of a wealth of work, and it is difficult to acknowledge all contributions adequately. Polyak and Juditsky (1992); Kushner and Yin (2003); Borkar (2008); Benveniste et al. (2012) provided asymptotic convergence guarantees (almost-sure convergence, central limit theorems) under both i.i.d. and Markovian noise settings. In particular, it has been established that LSA-PR can accelerate LSA and satisfies a central limit theorem with an asymptotically minimax-optimal covariance matrix. Although asymptotic convergence analysis is of theoretical interest, the current trend is to obtain non-asymptotic guarantees that account for both the limited sample size and the dimension of the parameter space. For these reasons, the non-asymptotic analysis of both i.i.d. and Markovian SA procedures has recently attracted considerable attention.
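The recursion (1) referred to above is not reproduced in this excerpt, so for concreteness the following minimal sketch assumes the common LSA form $\theta_{k+1} = \theta_k + \alpha (b_k - A_k \theta_k)$ with i.i.d. noisy observations $(A_k, b_k)$, together with the Polyak-Ruppert (PR) running average of the iterates. The synthetic problem data and the constant step size $\alpha$ are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: solve A theta = b from noisy observations (A_k, b_k).
# Assumed LSA form: theta_{k+1} = theta_k + alpha * (b_k - A_k theta_k);
# the paper's recursion (1) may be parametrized differently.
d = 5
A = np.eye(d) + 0.1 * rng.standard_normal((d, d))
A = A @ A.T + np.eye(d)            # positive definite, eigenvalues >= 1 (illustrative)
theta_star = rng.standard_normal(d)
b = A @ theta_star

alpha = 0.01                       # constant step size (assumption)
n = 50_000
theta = np.zeros(d)
theta_bar = np.zeros(d)            # Polyak-Ruppert running average

for k in range(1, n + 1):
    # i.i.d. noisy observations of (A, b)
    A_k = A + 0.1 * rng.standard_normal((d, d))
    b_k = b + 0.1 * rng.standard_normal(d)
    theta = theta + alpha * (b_k - A_k @ theta)    # LSA step
    theta_bar += (theta - theta_bar) / k           # incremental PR average

print("last-iterate error:", np.linalg.norm(theta - theta_star))
print("PR-average error:  ", np.linalg.norm(theta_bar - theta_star))
```

On such a run, the PR average typically exhibits markedly smaller error than the last iterate, which is the acceleration phenomenon the paper quantifies with finite-time high-probability bounds.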
