contractive stochastic approximation
Finite-Sample Analysis of Contractive Stochastic Approximation Using Smooth Convex Envelopes
Stochastic Approximation (SA) is a popular approach for solving fixed-point equations where the information is corrupted by noise. In this paper, we consider an SA algorithm involving a contraction mapping with respect to an arbitrary norm, and establish its finite-sample error bounds under different stepsizes. The idea is to construct a smooth Lyapunov function using the generalized Moreau envelope, and to show that the iterates of SA have negative drift with respect to that Lyapunov function. Our result is applicable to Reinforcement Learning (RL). In particular, we use it to establish the first-known convergence rate of the V-trace algorithm for off-policy TD-learning [18]. Importantly, our construction results in only a logarithmic dependence of the convergence bound on the size of the state-space.
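As a rough illustration of the Lyapunov construction, the toy Python sketch below numerically evaluates a (generalized) Moreau envelope $M(x) = \min_u \{ f(u) + \frac{1}{2\mu}\|x - u\|_2^2 \}$ of a non-smooth function. The choice $f(u) = \tfrac{1}{2}\|u\|_\infty^2$, the $\ell_2$ smoothing term, and $\mu = 0.1$ are illustrative assumptions of ours, not the paper's construction.

```python
# Toy numerical sketch (illustrative choices, not the paper's code): the Moreau
# envelope M(x) = min_u { f(u) + (1/(2*mu)) * ||x - u||_2^2 } of the non-smooth
# function f(u) = 0.5 * ||u||_inf^2, evaluated by an inner minimization.
import numpy as np
from scipy.optimize import minimize

def f(u):
    return 0.5 * np.max(np.abs(u)) ** 2              # non-smooth: squared infinity-norm

def moreau_envelope(x, mu=0.1):
    obj = lambda u: f(u) + (0.5 / mu) * np.sum((x - u) ** 2)
    res = minimize(obj, x0=np.asarray(x, dtype=float), method="Nelder-Mead")
    return res.fun                                    # smooth surrogate value at x

x = np.array([1.0, -2.0, 0.5])
print("f(x)            =", f(x))                      # 2.0
print("Moreau envelope =", moreau_envelope(x))        # <= f(x), and smooth in x
```

The envelope never exceeds the original function (taking $u = x$ is always feasible) and is smooth, which is what makes it usable as a Lyapunov surrogate for a non-smooth norm.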
Review for NeurIPS paper: Finite-Sample Analysis of Contractive Stochastic Approximation Using Smooth Convex Envelopes
Summary and Contributions: This paper considers the fundamental problem of computing a fixed point of a stochastic contractive operator. Formally, the paper supposes that there is an operator $H$ from $\mathbb{R}^d$ to $\mathbb{R}^d$ such that $\|H(x) - x_*\|_c \leq \gamma \|x - x_*\|_c$ for some norm $\|\cdot\|_c$, some $\gamma \in (0, 1)$, and a fixed point $x_*$, and that for any point $x$ one can compute $H(x) + w$, where $w$ is mean-zero noise whose magnitude depends on $x$. The paper then considers the natural stochastic approximation (SA) algorithm $x_{k+1} = x_k + \epsilon_k (H(x_k) - x_k + w_k)$ for some sequence of stepsizes $\epsilon_k$ and mean-zero noise $w_k$ whose norm depends on $x_k$. The paper provides general convergence bounds for this algorithm in terms of the norm in which $H$ contracts and of the bound on the noise vector $w_k$ in terms of $x_k$. Formally, the paper supposes that, conditioned on the history up to iteration $k$, $\mathbb{E}[\|w_k\|_n^2] \leq A (1 + \|x_k\|_n^2)$ for some norm $\|\cdot\|_n$, and provides bounds on the convergence of the method.
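To make the recursion concrete, here is a minimal numerical sketch of the iteration $x_{k+1} = x_k + \epsilon_k (H(x_k) - x_k + w_k)$. The operator $H(x) = r + \gamma P x$ (a $\gamma$-contraction in the infinity norm), the noise model, and the constant stepsize are illustrative choices of ours, not taken from the paper or the review.

```python
# Minimal sketch (not code from the paper) of the SA recursion
# x_{k+1} = x_k + eps * (H(x_k) - x_k + w_k), with an illustrative operator
# H(x) = r + gamma * P @ x, a gamma-contraction with respect to ||.||_inf.
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.9
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])                      # row-stochastic matrix
r = np.array([1.0, 0.0, 2.0])

H = lambda x: r + gamma * P @ x                      # contraction mapping
x_star = np.linalg.solve(np.eye(3) - gamma * P, r)   # its unique fixed point

x = np.zeros(3)
eps = 0.05                                           # constant stepsize
for k in range(5000):
    # mean-zero noise with E||w_k||^2 <= A * (1 + ||x_k||^2) for some constant A
    w_k = 0.1 * (1.0 + np.linalg.norm(x)) * rng.standard_normal(3)
    x = x + eps * (H(x) - x + w_k)

# with a constant stepsize the iterates settle in a noise ball around x_*
print("||x_K - x_*||_inf =", np.max(np.abs(x - x_star)))
```

Switching `eps` to a diminishing schedule (e.g. $\epsilon_k = 1/(k+1)$) gives the other stepsize regime analyzed in the paper, where the noise ball shrinks over time.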
A Concentration Bound for TD(0) with Function Approximation
Chandak, Siddharth, Borkar, Vivek S.
We derive a concentration bound of the type "for all $n \geq n_0$ for some $n_0$" for TD(0) with linear function approximation. We work with online TD learning with samples from a single sample path of the underlying Markov chain. This makes our analysis significantly different from offline TD learning or TD learning with access to independent samples from the stationary distribution of the Markov chain. We treat TD(0) as a contractive stochastic approximation algorithm with both martingale and Markov noise. The Markov noise is handled using the Poisson equation, and the lack of almost sure guarantees on the boundedness of the iterates is handled using the concept of relaxed concentration inequalities.
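Below is a minimal sketch of the setting this abstract describes: online TD(0) with linear function approximation driven by a single sample path of a small Markov reward process. The chain, rewards, features, stepsize, and the closed-form comparison with the TD fixed point are illustrative assumptions of ours, not material from the paper.

```python
# Minimal sketch of online TD(0) with linear function approximation on a single
# trajectory of a small Markov reward process (illustrative values only).
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.9
P = np.array([[0.7, 0.3, 0.0],
              [0.1, 0.6, 0.3],
              [0.3, 0.0, 0.7]])          # irreducible transition matrix
r = np.array([0.0, 1.0, 2.0])            # state rewards
Phi = np.array([[1.0, 0.0],
                [0.5, 0.5],
                [0.0, 1.0]])             # feature matrix, one row per state

# online TD(0) along one sample path (no restarts, no i.i.d. samples)
theta = np.zeros(2)
alpha = 0.01                              # small constant stepsize
s = 0
for n in range(100000):
    s_next = rng.choice(3, p=P[s])
    td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta = theta + alpha * td_error * Phi[s]
    s = s_next

# closed-form TD fixed point: Phi^T D (I - gamma P) Phi theta_* = Phi^T D r,
# where D is the diagonal matrix of the chain's stationary distribution
evals, evecs = np.linalg.eig(P.T)
d = np.real(evecs[:, np.argmax(np.real(evals))])
d = d / d.sum()
D = np.diag(d)
theta_star = np.linalg.solve(Phi.T @ D @ (np.eye(3) - gamma * P) @ Phi, Phi.T @ D @ r)

print("online TD(0) estimate:", theta)        # should land near theta_*
print("TD fixed point       :", theta_star)
```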