Multi-agent online learning in time-varying games

Duvocelle, Benoit, Mertikopoulos, Panayotis, Staudigl, Mathias, Vermeulen, Dries

Sep-4-2021 – arXiv.org Artificial Intelligence

We examine the long-run behavior of multi-agent online learning in games that evolve over time. Specifically, we focus on a wide class of policies based on mirror descent, and we show that the induced sequence of play (a) converges to Nash equilibrium in time-varying games that stabilize in the long run to a strictly monotone limit; and (b) stays asymptotically close to the evolving equilibrium of the sequence of stage games (assuming they are strongly monotone). Our results apply to both gradient-based and payoff-based feedback, the latter being the "bandit feedback" case where players only get to observe the payoffs of their chosen actions.
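To make claim (a) concrete, here is a minimal sketch, not the authors' algorithm or experiments. It uses the Euclidean instance of mirror descent (projected gradient ascent) with full gradient feedback in a two-player game whose payoffs drift toward a limit. The quadratic payoffs, the constants B_LIMIT and DRIFT, the 1/t drift rate, and the step-size schedule t^(-0.6) are all assumptions chosen for illustration.

```python
# Toy time-varying game (hypothetical, for illustration only).
# Player i picks x_i in [-1, 1] and receives
#   u_i(x; t) = x_i * (b_i(t) - 0.5 * x_j) - 0.5 * x_i**2,
# where b(t) = B_LIMIT + DRIFT / t, so the stage games stabilize as t grows.
# The Jacobian of the payoff-gradient field is [[-1, -0.5], [-0.5, -1]],
# which is negative definite, so every stage game is strongly monotone.

B_LIMIT = (0.3, -0.2)   # assumed parameters of the limit game
DRIFT = (0.5, -0.5)     # assumed perturbation, vanishing like 1/t


def clip(z, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return max(lo, min(hi, z))


def payoff_gradients(x, t):
    """Each player's gradient du_i/dx_i in the stage game at time t."""
    b = [B_LIMIT[i] + DRIFT[i] / t for i in range(2)]
    return [b[i] - 0.5 * x[1 - i] - x[i] for i in range(2)]


def run(num_steps=5000):
    """Both players run projected gradient ascent on their own payoffs."""
    x = [0.0, 0.0]
    for t in range(1, num_steps + 1):
        step = t ** -0.6  # assumed vanishing step-size schedule
        v = payoff_gradients(x, t)
        x = [clip(x[i] + step * v[i]) for i in range(2)]
    return x


def limit_equilibrium():
    """Interior Nash equilibrium of the limit game: x_i + 0.5 * x_j = b_i."""
    b1, b2 = B_LIMIT
    return [(b1 - 0.5 * b2) / 0.75, (b2 - 0.5 * b1) / 0.75]


if __name__ == "__main__":
    print("final play:       ", run())
    print("limit equilibrium:", limit_equilibrium())
```

In this toy setting the final play should land near the limit equilibrium, roughly (0.533, -0.467). The bandit-feedback case described in the abstract would replace the exact gradients with estimates built from observed payoffs, which this sketch does not attempt.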

artificial intelligence, equilibrium, machine learning

Country:
- Europe (0.93)
- North America > United States (0.67)

Genre:
- Research Report > New Finding (0.34)

Industry:
- Education > Educational Setting
  - Online (0.61)
- Leisure & Entertainment > Games (0.68)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning > Agents (1.00)
  - Game Theory (1.00)
