AITopics | ftrl

55563844bcd4bba067fe86ac1f008c7e-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 23:25:01 GMT

artificial intelligence, machine learning, zero-sum game, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)

Add feedback

No-regret Learning in Harmonic Games: Extrapolation in the Face of Conflicting Interests

Neural Information Processing SystemsMar-22-2026, 17:33:04 GMT

The long-run behavior of multi-agent online learning -- and, in particular, no-regret learning -- is relatively well-understood in potential games, where players have common interests. By contrast, in general harmonic games -- the strategic complement of potential games, where players have competing interests -- very little is known outside the narrow subclass of $2$-player zero-sum games with a fully-mixed equilibrium. Our paper seeks to partially fill this gap by focusing on the full class of (generalized) harmonic games and examining the convergence properties of follow-the-regularized-leader (FTRL), the most widely studied class of no-regret learning schemes. As a first result, we show that the continuous-time dynamics of FTRL are Poincaré recurrent, i.e., they return arbitrarily close to their starting point infinitely often, and hence fail to converge. In discrete time, the standard, vanilla implementation of FTRL may lead to even worse outcomes, eventually trapping the players in a perpetual cycle of best-responses. However, if FTRL is augmented with a suitable extrapolation step -- which includes as special cases the optimistic and mirror-prox variants of FTRL -- we show that learning converges to a Nash equilibrium from any initial condition, and all players are guaranteed at most $\mathcal{O}(1)$ regret. These results provide an in-depth understanding of no-regret learning in harmonic games, nesting prior work on $2$-player zero-sum games, and showing at a high level that potential and harmonic games are complementary not only from the strategic but also from the dynamic viewpoint.

artificial intelligence, game theory, proceedings, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Game Theory (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.59)

Add feedback

A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of \Theta(T {2/3}) and its Application to Best-of-Both-Worlds

Neural Information Processing SystemsMar-18-2026, 04:19:51 GMT

Follow-the-Regularized-Leader (FTRL) is a powerful framework for various online learning problems. By designing its regularizer and learning rate to be adaptive to past observations, FTRL is known to work adaptively to various properties of an underlying environment. However, most existing adaptive learning rates are for online learning problems with a minimax regret of $\Theta(\sqrt{T})$ for the number of rounds $T$, and there are only a few studies on adaptive learning rates for problems with a minimax regret of $\Theta(T^{2/3})$, which include several important problems dealing with indirect feedback. To address this limitation, we establish a new adaptive learning rate framework for problems with a minimax regret of $\Theta(T^{2/3})$. Our learning rate is designed by matching the stability, penalty, and bias terms that naturally appear in regret upper bounds for problems with a minimax regret of $\Theta(T^{2/3})$. As applications of this framework, we consider three major problems with a minimax regret of $\Theta(T^{2/3})$: partial monitoring, graph bandits, and multi-armed bandits with paid observations. We show that FTRL with our learning rate and the Tsallis entropy regularizer improves existing Best-of-Both-Worlds (BOBW) regret upper bounds, which achieve simultaneous optimality in the stochastic and adversarial regimes. The resulting learning rate is surprisingly simple compared to the existing learning rates for BOBW algorithms for problems with a minimax regret of $\Theta(T^{2/3})$.

artificial intelligence, machine learning, minimax regret, (7 more...)

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

FastandFuriousLearninginZero-SumGames: VanishingRegretwithNon-VanishingStepSizes

Neural Information Processing SystemsFeb-11-2026, 21:22:18 GMT

This phenomenon, that we coin "fast and furious" learning in games, sets a new benchmark about what is possible both in max-min optimization as well as in multi-agent systems.

artificial intelligence, gradientdescent, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.98)

Add feedback

dcd2f3f312b6705fb06f4f9f1b55b55c-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 11:47:42 GMT

algorithm, nullnull, regularizer, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.41)

Add feedback

Adversarial

Neural Information Processing SystemsFeb-11-2026, 11:47:38 GMT

Quantile (and, more generally, KL) regret bounds, such as those achieved by NormalHedge (Chaudhuri, Freund, and Hsu 2009) and its variants, relax the goal of competing against the best individual expert to only competing against a majority of experts on adversarial data.

artificial intelligence, machine learning, regularizer, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback