Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation
Durmus, Alain, Moulines, Eric, Naumov, Alexey, Samsonov, Sergey
–arXiv.org Artificial Intelligence
The LSA algorithm is central in statistics, machine learning, and linear systems identification; see, e.g., Eweda and Macchi (1983); Widrow and Stearns (1985); Benveniste et al. (2012); Kushner and Yin (2003) and references therein. More recently, it has attracted renewed interest in machine learning, especially for high-dimensional least squares and reinforcement learning (RL) problems (Bertsekas and Tsitsiklis, 2003; Bottou et al., 2018; Sutton, 1988; Bertsekas, 2019; Watkins and Dayan, 1992). The LSA and LSA-PR recursions (1) have been the subject of a wealth of work, and it is difficult to adequately acknowledge all contributions. Polyak and Juditsky (1992); Kushner and Yin (2003); Borkar (2008); Benveniste et al. (2012) provided asymptotic convergence guarantees (almost sure convergence, central limit theorem) under both i.i.d. and Markovian noise settings. In particular, it has been established that LSA-PR can accelerate LSA and satisfies a central limit theorem with an asymptotically minimax-optimal covariance matrix. Although asymptotic convergence analysis is of theoretical interest, the current trend is to obtain non-asymptotic guarantees that take into account both the limited sample size and the dimension of the parameter space. For these reasons, non-asymptotic analysis of both i.i.d. and Markovian SA procedures has recently attracted much attention.
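Recursion (1) is not reproduced on this page, so the following is only a minimal sketch of the standard setting the abstract refers to: linear stochastic approximation with i.i.d. noisy observations (A_k, b_k) of a fixed pair (Ā, b̄), a constant step size, and the Polyak-Ruppert average computed alongside the last iterate. The toy matrix, step size, and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: recover theta_star solving A_bar @ theta = b_bar
# from noisy samples (A_k, b_k) of (A_bar, b_bar).
d = 2
A_bar = np.array([[2.0, 0.5], [0.5, 1.5]])  # Hurwitz-stable, so LSA converges
theta_star = np.array([1.0, -1.0])
b_bar = A_bar @ theta_star


def lsa_pr(n_steps=20000, step=0.05, noise=0.1):
    """Run LSA with a constant step and return (last iterate, PR average)."""
    theta = np.zeros(d)
    avg = np.zeros(d)
    for k in range(1, n_steps + 1):
        # i.i.d. noisy observation of (A_bar, b_bar) at step k.
        A_k = A_bar + noise * rng.standard_normal((d, d))
        b_k = b_bar + noise * rng.standard_normal(d)
        # LSA recursion: theta_k = theta_{k-1} - step * (A_k theta_{k-1} - b_k).
        theta = theta - step * (A_k @ theta - b_k)
        # Running Polyak-Ruppert average of the iterates.
        avg += (theta - avg) / k
    return theta, avg


theta_last, theta_pr = lsa_pr()
print("last-iterate error:", np.linalg.norm(theta_last - theta_star))
print("PR-average error:  ", np.linalg.norm(theta_pr - theta_star))
```

The averaged iterate is what the paper's high-probability bounds concern: it smooths out the step-size-driven fluctuations of the last iterate, which is the acceleration effect attributed to LSA-PR above.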
Mar-29-2023