AITopics | xk 1

Collaborating Authors

xk 1

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Concentration of General Stochastic Approximation Under Heavy-Tailed Markovian Noise

Agrawal, Shubhada, Maguluri, Siva Theja, Zubeldia, Martin

arXiv.org Machine LearningMay-21-2026

We establish maximal concentration bounds for the iterates generated by stochastic approximation algorithms with general step sizes, where the noise has a finite-state Markovian component plus a Martingale-difference component. When the Martingale-difference noise is bounded, we show that the tail of the error can be sub-Gaussian, sub-Weibull, or something lighter than any Pareto but heavier than any Weibull, depending on the step size sequence and on whether the random operator is almost surely contractive, almost surely non-expansive, or expansive with positive probability. Our analysis relies on a novel Lyapunov function involving the moment-generating function of the solution to a Poisson equation, together with an auxiliary projected algorithm. We complement the upper bounds with worst-case examples showing that qualitatively sharper bounds are impossible. We further study the case of unbounded Martingale-difference noise when the average operator is contractive, and the step sizes are of order $1/k$. In this setting, we show that if the random operator is almost surely non-expansive, then the error tail is at most three times heavier than the noise tail, whereas if the random operator is expansive with positive probability, then the error may have substantially heavier tails. These results are obtained through a novel black-box truncation argument that reduces the unbounded-noise setting to the bounded-noise case.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

2605.20999

Country: Europe (0.27)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Block-Coordinate Methods and Restarting for Solving Extensive-Form Games

Neural Information Processing SystemsApr-25-2026, 21:26:16 GMT

Coordinate descent methods are popular in machine learning and optimization for their simple sparse updates and excellent practical performance. In the context of large-scale sequential game solving, these same properties would be attractive, but until now no such methods were known, because the strategy spaces do not satisfy the typical separable block structure exploited by such methods. We present the first cyclic coordinate-descent-like method for the polytope of sequence-form strategies, which form the strategy spaces for the players in an extensive-form game (EFG). Our method exploits the recursive structure of the proximal update induced by what are known as dilated regularizers, in order to allow for a pseudo block-wise update. We show that our method enjoys a O(1/T)convergence rate to a two-player zero-sum Nash equilibrium, while avoiding the worst-case polynomial scaling with the number of blocks common to cyclic methods. We empirically show that our algorithm usually performs better than other state-of-the-art first-order methods (i.e., mirror prox), and occasionally can even beat CFR+, a state-ofthe-art algorithm for numerical equilibrium computation in zero-sum EFGs. We then introduce a restarting heuristic for EFG solving. We show empirically that restarting can lead to speedups, sometimes huge, both for our cyclic method, as well as for existing methods such as mirror prox and predictive CFR+.

artificial intelligence, decision point, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Supplementary Materials AExpanded Related Work

Neural Information Processing SystemsApr-24-2026, 17:31:08 GMT

A number of gradient-based bilevel algorithms have been proposed via AIDand ITD-based hypergradient approximations. For example, AID-based hypergradient computation [4, 33, 10, 11, 19] estimates the Hessian-inverse-vector product by solving a linear system with an efficient iterative algorithm. ITD-based hypergradient computation [31, 8, 9, 6, 35, 17] involves a backpropagation over the inner-loop gradient-based optimization path. Convergence rate of AIDand ITD-based bilevel methods has been studied recently. For example, [10, 19] and [19, 17] analyzed the convergence rate and complexity of AIDand ITD-based bilevel algorithms, respectively.

artificial intelligence, conjunction, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

Unbiased and Biased Variance-Reduced Forward-Reflected-Backward Splitting Methods for Stochastic Composite Inclusions

Tran-Dinh, Quoc, Nguyen-Trung, Nghia

arXiv.org Machine LearningMar-17-2026

This paper develops new variance-reduction techniques for the forward-reflected-backward splitting (FRBS) method to solve a class of possibly nonmonotone stochastic composite inclusions. Unlike unbiased estimators such as mini-batching, developing stochastic biased variants faces a fundamental technical challenge and has not been utilized before for inclusions and fixed-point problems. We fill this gap by designing a new framework that can handle both unbiased and biased estimators. Our main idea is to construct stochastic variance-reduced estimators for the forward-reflected direction and use them to perform iterate updates. First, we propose a class of unbiased variance-reduced estimators and show that increasing mini-batch SGD, loopless-SVRG, and SAGA estimators fall within this class. For these unbiased estimators, we establish a $\mathcal{O}(1/k)$ best-iterate convergence rate for the expected squared residual norm, together with almost-sure convergence of the iterate sequence to a solution. Consequently, we prove that the best oracle complexities for the $n$-finite-sum and expectation settings are $\mathcal{O}(n^{2/3}ε^{-2})$ and $\mathcal{O}(ε^{-10/3})$, respectively, when employing loopless-SVRG or SAGA, where $ε$ is a desired accuracy. Second, we introduce a new class of biased variance-reduced estimators for the forward-reflected direction, which includes SARAH, Hybrid SGD, and Hybrid SVRG as special instances. While the convergence rates remain valid for these biased estimators, the resulting oracle complexities are $\mathcal{O}(n^{3/4}ε^{-2})$ and $\mathcal{O}(ε^{-5})$ for the $n$-finite-sum and expectation settings, respectively. Finally, we conduct two numerical experiments on AUC optimization for imbalanced classification and policy evaluation in reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2603.15576

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates

Krishnakumar Balasubramanian, Saeed Ghadimi

Neural Information Processing SystemsFeb-12-2026, 15:06:13 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, optimization, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts (0.04)
North America > United States > California > Yolo County > Davis (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Add feedback

949b3011c50300a2b4e60377466f52a8-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 20:47:38 GMT

Eachofthebase neurons is labelled by a function from the base classH.

artificial intelligence, machine learning, xn1, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

CommunicationAccelerationofLocalGradient MethodsviaanAcceleratedPrimal-DualAlgorithm withInexactProx

Neural Information Processing SystemsFeb-10-2026, 13:30:13 GMT

However, all known theoretical rates are worse than the rate of gradient descent, which synchronizes after every gradient step.

artificial intelligence, arxivpreprintarxiv, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

AcceleratedPrimal-DualGradientMethodforSmooth andConvex-ConcaveSaddle-PointProblemswith BilinearCoupling

Neural Information Processing SystemsFeb-10-2026, 13:15:14 GMT

A common approach to this problem is to use a linear approximationVπ(s)=ϕ(s) x,whereϕ(s)isafeaturevectorofastate s.

artificial intelligence, machine learning, xk 1, (16 more...)

Neural Information Processing Systems

Country:

Asia > Russia (0.14)
North America > United States (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Add feedback

Score-BasedDiffusionmeets AnnealedImportanceSampling

Neural Information Processing SystemsFeb-10-2026, 12:01:36 GMT

More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains oneofthemost effectivemethods formarginal likelihood estimation.

artificial intelligence, machine learning, tk 1, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

c336346c777707e09cab2a3c79174d90-Supplemental.pdf

Neural Information Processing SystemsFeb-10-2026, 04:26:55 GMT

We also establish new convergence complexities to achieve an approximate KKT solution when the objective can be smooth/nonsmooth, deterministic/stochastic and convex/nonconvex with complexity that is on a par with gradient descent for unconstrained optimization problems in respective cases. To the best of our knowledge, this is the first study of the first-order methods with complexity guarantee for nonconvex sparse-constrained problems.

artificial intelligence, machine learning, xk 1, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback