A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
In this work, we study two-player zero-sum stochastic games and develop a variant of the smoothed best-response learning dynamics that combines independent learning dynamics for matrix games with the minimax value iteration for stochastic games. The resulting learning dynamics are payoff-based, convergent, rational, and symmetric between the two players.
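The dynamics themselves are not spelled out in this abstract; as an illustrative sketch of the matrix-game ingredient, a smoothed (softmax) best response against the opponent's current mixed strategy might look as follows. The function name and the temperature parameter `tau` are our assumptions for the demo, not taken from the paper:

```python
import numpy as np

def smoothed_best_response(payoffs, opponent_strategy, tau=0.1):
    """Softmax (smoothed) best response to the opponent's mixed strategy.

    payoffs: matrix A where A[i, j] is player 1's payoff for actions (i, j).
    tau: temperature; as tau -> 0 this approaches the exact best response.
    """
    q = payoffs @ opponent_strategy      # expected payoff of each action
    z = np.exp((q - q.max()) / tau)      # numerically stable softmax
    return z / z.sum()

# Matching pennies: the unique equilibrium strategy is (1/2, 1/2).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
br = smoothed_best_response(A, np.array([0.5, 0.5]))
```

Against the uniform strategy in matching pennies, every action has the same expected payoff, so the smoothed best response is itself uniform.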
Finite-Sample Analysis for SARSA with Linear Function Approximation
SARSA is an on-policy reinforcement-learning algorithm for learning a Markov decision process policy. We investigate the SARSA algorithm with linear function approximation in the non-i.i.d.\ setting, where only a single sample trajectory is available. With a sufficiently smooth, Lipschitz-continuous policy improvement operator, SARSA has been shown to converge asymptotically. However, its non-asymptotic analysis is challenging and remains open due to the non-i.i.d.\ nature of the data.
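As a reminder of the algorithm being analyzed, one SARSA update with linear function approximation takes the form below. This is a generic sketch; the step size and discount values are placeholders, not from the paper:

```python
import numpy as np

def sarsa_linear_step(theta, phi_sa, reward, phi_next_sa, alpha=0.05, gamma=0.99):
    """One SARSA update with linear function approximation.

    Q(s, a) is approximated by theta . phi(s, a). phi_next_sa is the feature
    vector of the *next* state-action pair chosen by the current policy
    (on-policy), which is what distinguishes SARSA from Q-learning.
    """
    td_error = reward + gamma * theta @ phi_next_sa - theta @ phi_sa
    return theta + alpha * td_error * phi_sa
```

Because the next action is drawn from the policy being improved, consecutive samples are correlated; this is exactly the non-i.i.d.\ difficulty the abstract refers to.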
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
In TD-learning, off-policy sampling is known to be more practical than on-policy sampling, and by decoupling learning from data collection it enables data reuse. It is known that policy evaluation has the interpretation of solving a generalized Bellman equation. In this paper, we derive finite-sample bounds for any general off-policy TD-like stochastic approximation algorithm that solves for the fixed point of this generalized Bellman operator. Our key step is to show that the generalized Bellman operator is simultaneously a contraction mapping with respect to a weighted $\ell_p$-norm for every $p$ in $[1,\infty)$, with a common contraction factor. Off-policy TD-learning is, however, known to suffer from high variance due to the product of importance sampling ratios.
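For concreteness, a single-step off-policy TD(0) update with one importance-sampling ratio can be sketched as below. The tabular setting and variable names are illustrative assumptions, not the paper's algorithm; multi-step, TD($\lambda$)-style variants multiply such ratios along the trajectory, which is the source of the variance mentioned above:

```python
import numpy as np

def off_policy_td0_step(V, s, a, r, s_next, pi, mu, alpha=0.1, gamma=0.95):
    """One off-policy TD(0) update for a tabular value function V.

    pi[s, a]: target-policy probability of action a in state s.
    mu[s, a]: behavior-policy probability (the policy generating the data).
    """
    rho = pi[s, a] / mu[s, a]                       # importance-sampling ratio
    V[s] += alpha * rho * (r + gamma * V[s_next] - V[s])
    return V
```

A single ratio is bounded when `mu` has full support, but a product of many such ratios can be exponentially large, hence the variance issue.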
Finite-Sample Analysis of Contractive Stochastic Approximation Using Smooth Convex Envelopes
Stochastic Approximation (SA) is a popular approach for solving fixed-point equations where the information is corrupted by noise. In this paper, we consider an SA algorithm involving a contraction mapping with respect to an arbitrary norm, and establish its finite-sample error bounds under different stepsize choices. The idea is to construct a smooth Lyapunov function using the generalized Moreau envelope, and to show that the iterates of SA have negative drift with respect to this Lyapunov function. Our result is applicable to Reinforcement Learning (RL). In particular, we use it to establish the first-known convergence rate of the V-trace algorithm for off-policy TD-learning [18]. Importantly, our construction results in only a logarithmic dependence of the convergence bound on the size of the state space.
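As a toy illustration of the setting (a simulation of the iteration, not the paper's analysis), the SA recursion for a contraction $F$ with noisy evaluations and diminishing stepsizes can be run as follows; the specific $F$, the $1/(k+1)$ stepsize schedule, and the Gaussian noise model are assumptions for the demo:

```python
import numpy as np

def contractive_sa(F, x0, n_iters=5000, noise_std=0.1, seed=0):
    """Simulate x_{k+1} = x_k + a_k (F(x_k) + w_k - x_k) with a_k = 1/(k+1),
    where F is a contraction and w_k is zero-mean observation noise."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(n_iters):
        a_k = 1.0 / (k + 1)
        w = rng.normal(0.0, noise_std, size=x.shape)
        x = x + a_k * (F(x) + w - x)
    return x

# F is a 1/2-contraction in any norm; its fixed point solves x = x/2 + 1, so x* = 2.
F = lambda x: 0.5 * x + 1.0
x_hat = contractive_sa(F, x0=np.zeros(1))
```

With the diminishing stepsizes the noise is averaged out, and the iterate settles near the fixed point $x^\* = 2$.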
Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators
We provide a finite-sample analysis of a general framework for using $k$-nearest-neighbor statistics to estimate functionals of a nonparametric continuous probability density, including entropies and divergences. Rather than plugging a consistent density estimate (which requires $k \to \infty$ as the sample size $n \to \infty$) into the functional of interest, the estimators we consider fix $k$ and perform a bias correction. This can be more efficient computationally and, as we show, statistically, leading to faster convergence rates. Our framework unifies several previous estimators, and for most of them ours are the first finite-sample guarantees.
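As a concrete instance of the fixed-$k$ approach, the classical Kozachenko-Leonenko estimator of differential entropy keeps $k$ fixed and corrects the bias with digamma terms. A minimal sketch with brute-force neighbor search (the function names are ours, and this is one member of the family rather than the paper's full framework):

```python
import numpy as np
from math import log, pi, lgamma

EULER = 0.5772156649015329

def digamma_int(m):
    """Digamma at a positive integer: psi(m) = -gamma + sum_{i=1}^{m-1} 1/i."""
    return -EULER + sum(1.0 / i for i in range(1, m))

def kl_entropy(x, k=3):
    """Kozachenko-Leonenko differential-entropy estimate with fixed k.

    The additive digamma terms are the bias correction that lets k stay
    fixed as n grows, instead of plugging in a consistent density estimate.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    # Distance from each point to its k-th nearest neighbor (excluding itself).
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    eps = np.sort(dists, axis=1)[:, k]           # column 0 is the point itself
    log_c_d = (d / 2) * log(pi) - lgamma(d / 2 + 1)  # log volume of the unit ball
    return digamma_int(n) - digamma_int(k) + log_c_d + d * np.mean(np.log(eps))
```

For the uniform density on $[0,1]$ the true differential entropy is $0$, so the estimate on a moderate uniform sample should land near zero.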
Reviews: Finite-Sample Analysis for SARSA with Linear Function Approximation
This paper deals with an important problem in theoretical reinforcement learning (RL), namely the finite-time analysis of on-policy RL algorithms such as SARSA. If the analysis techniques and proofs are correct and concrete, this work could have a broad impact on the analysis of related stochastic approximation/RL algorithms. Although the problem is important and interesting, the present submission raises several major concerns that limit its contributions and even call into question the practical usefulness of the reported theoretical results. These concerns are listed as follows. To facilitate the analysis, a number of the assumptions adopted in this work are strong and impractical.
Reviews: Finite-Sample Analysis for SARSA with Linear Function Approximation
Because the initial reviews were mixed, I obtained an additional review from an expert in the area of this paper. This fourth review came back clearly positive, but in the meantime one of the positive reviewers changed to negative (and later one of the negative reviewers turned positive). There was then extensive discussion, but the reviewers never agreed on how best to view this paper; in fact, they seemed to talk past each other, and in the end we had two positive and two negative reviews. As the area chair, reading the reviews and listening to the discussion, I found the fourth, very positive review to be the most compelling.