AITopics | time-varying game

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Neural Information Processing Systems, Dec-24-2025, 13:47:25 GMT

Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multiagent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also apply to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.

convergence, name change, no-regret learning dynamic, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Neural Information Processing Systems, Oct-11-2024, 03:21:14 GMT

Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multiagent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times.

convergence, no-regret learning dynamic, time-varying game, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Anagnostides, Ioannis, Panageas, Ioannis, Farina, Gabriele, Sandholm, Tuomas

arXiv.org Artificial Intelligence, Oct-18-2023

Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multiagent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also apply to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.
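As a rough illustration of the setting in this abstract, the sketch below runs one common two-sequence form of projected optimistic gradient descent on a toy time-varying zero-sum game. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar strategy sets [-1, 1], the payoff f_t(x, y) = (x - a_t)(y - b_t), the drifting saddle point returned by the hypothetical `saddle` function, and the step size `eta`.

```python
def clip(v, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def saddle(t):
    """Hypothetical drifting saddle point (a_t, b_t) that settles at (0, 0)."""
    s = 0.5 / (t + 1) ** 2
    return s, -s


def equilibrium_gap(x, y, a, b):
    """Gap of (x, y) in the game f(x, y) = (x - a) * (y - b) on [-1, 1]^2.

    x minimizes and y maximizes. f is linear in each argument, so the best
    deviations are attained at the endpoints of the interval.
    """
    best_for_y = max((x - a) * (yy - b) for yy in (-1.0, 1.0))
    best_for_x = min((xx - a) * (y - b) for xx in (-1.0, 1.0))
    return best_for_y - best_for_x


def run_ogd(steps, eta=0.2):
    """Play `steps` rounds and return the equilibrium gap at each round."""
    x_hat, y_hat = 0.0, 0.0   # secondary (non-optimistic) iterates
    gx, gy = 0.0, 0.0         # last observed gradients, used as predictions
    gaps = []
    for t in range(steps):
        a, b = saddle(t)
        # Optimistic step: move using the previous round's gradient.
        x = clip(x_hat - eta * gx)
        y = clip(y_hat + eta * gy)
        gaps.append(equilibrium_gap(x, y, a, b))
        # Observe the gradients of the current game at the played point.
        gx, gy = y - b, x - a
        # Update the secondary iterates with the fresh gradients.
        x_hat = clip(x_hat - eta * gx)
        y_hat = clip(y_hat + eta * gy)
    return gaps


gaps = run_ogd(2000)
```

In this toy instance the saddle point moves less and less over time, so the recorded gap shrinks as play continues; with a sequence of games that keeps changing substantially, no such behavior should be expected from this sketch.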

convergence, no-regret learning dynamic, time-varying game

arXiv.org Artificial Intelligence

arXiv: 2301.11241

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Convergence Analysis of the Best Response Algorithm for Time-Varying Games

Wang, Zifan, Shen, Yi, Zavlanos, Michael M., Johansson, Karl H.

arXiv.org Artificial Intelligence, Sep-1-2023

This paper studies a class of strongly monotone games involving non-cooperative agents that optimize their own time-varying cost functions. We assume that the agents can observe other agents' historical actions and choose actions that best respond to other agents' previous actions; we call this a best response scheme. We start by analyzing the convergence rate of this best response scheme for standard time-invariant games. Specifically, we provide a sufficient condition on the strong monotonicity parameter of the time-invariant games under which the proposed best response algorithm achieves exponential convergence to the static Nash equilibrium. We further illustrate that this best response algorithm may oscillate when the proposed sufficient condition fails to hold, which indicates that this condition is tight. Next, we analyze this best response algorithm for time-varying games where the cost functions of each agent change over time. Under similar conditions as for time-invariant games, we show that the proposed best response algorithm stays asymptotically close to the evolving equilibrium. We do so by analyzing both the equilibrium tracking error and the dynamic regret. Numerical experiments on economic market problems are presented to validate our analysis.
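The best response scheme described in this abstract can be sketched on a toy two-player quadratic game. The cost functions, the coupling parameter `c`, and the helper names below are assumptions chosen for illustration, not the paper's model or its sufficient condition. Player 1 minimizes 0.5*x1^2 + c*x1*x2 - b1*x1 and player 2 minimizes 0.5*x2^2 - c*x1*x2 - b2*x2, so each best response to the opponent's previous action has a closed form.

```python
import math


def best_response_step(x1, x2, c, b1, b2):
    """Both players best respond to the other's previous action."""
    # Setting each player's own derivative to zero gives the closed forms.
    return b1 - c * x2, b2 + c * x1


def nash(c, b1, b2):
    """Equilibrium of the stage game: the fixed point of the best responses."""
    x1 = (b1 - c * b2) / (1.0 + c * c)
    return x1, b2 + c * x1


def static_targets(t):
    """Time-invariant game: the linear cost terms never change."""
    return 1.0, 1.0


def drifting_targets(t):
    """Hypothetical slowly varying linear term for player 1."""
    return 1.0 + 0.1 * math.sin(0.05 * t), 1.0


def run(c, steps, targets=static_targets):
    """Return the distance to the current equilibrium after each round."""
    x1, x2 = 0.0, 0.0
    errors = []
    for t in range(steps):
        b1, b2 = targets(t)
        x1, x2 = best_response_step(x1, x2, c, b1, b2)
        e1, e2 = nash(c, b1, b2)
        errors.append(math.hypot(x1 - e1, x2 - e2))
    return errors


converging = run(0.5, 50)                    # weak coupling
oscillating = run(1.0, 50)                   # strong coupling
tracking = run(0.5, 500, drifting_targets)   # time-varying game
```

In this toy game the error is multiplied by exactly |c| each round when the game is static, so play converges geometrically for |c| < 1 and cycles without converging at c = 1, even though the game's pseudo-gradient remains strongly monotone for every c. With the drifting costs, the iterates stay within a small distance of the moving equilibrium rather than converging to a point.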

best response algorithm, convergence analysis, time-varying game

arXiv.org Artificial Intelligence

arXiv: 2309.00307

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.53)

Multi-agent online learning in time-varying games

Duvocelle, Benoit, Mertikopoulos, Panayotis, Staudigl, Mathias, Vermeulen, Dries

arXiv.org Artificial Intelligence, Sep-4-2021

We examine the long-run behavior of multi-agent online learning in games that evolve over time. Specifically, we focus on a wide class of policies based on mirror descent, and we show that the induced sequence of play (a) converges to Nash equilibrium in time-varying games that stabilize in the long run to a strictly monotone limit; and (b) stays asymptotically close to the evolving equilibrium of the sequence of stage games (assuming they are strongly monotone). Our results apply to both gradient-based and payoff-based feedback, i.e., the "bandit feedback" case where players only get to observe the payoffs of their chosen actions.

artificial intelligence, equilibrium, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1287/moor.2022.1283

arXiv: 1809.03066
