Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes

Oct-9-2024, 19:44:28 GMT–Neural Information Processing Systems

We show for the first time that two seemingly contradictory objectives can be reconciled in online learning in zero-sum games: vanishing time-average regret and non-vanishing step sizes. This phenomenon, which we call "fast and furious" learning in games, sets a new benchmark for what is possible both in max-min optimization and in multi-agent systems. Our analysis does not depend on introducing a carefully tailored dynamic. Instead, we focus on the most well-studied online dynamic, gradient descent. Similarly, we focus on the simplest textbook class of games, two-agent two-strategy zero-sum games, such as Matching Pennies. Even for this simplest of benchmarks, the best known bound for total regret prior to our work was the trivial one of O(T), which applies immediately even to a non-learning agent.
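The setting in the abstract can be explored numerically. The following is a minimal sketch, not the paper's implementation: it assumes a projected gradient step with a constant step size, and it parametrizes each agent's mixed strategy in Matching Pennies by a single probability of playing heads. The function name, the starting point, and the step size are illustrative choices, and the exact gradient descent variant analyzed in the paper may differ.

```python
def clip01(x):
    """Project a scalar onto [0, 1] (a valid probability of playing heads)."""
    return min(1.0, max(0.0, x))


def simulate_matching_pennies(steps, eta, p0=0.6, q0=0.5):
    """Run simultaneous projected gradient steps with a fixed step size eta.

    Assumed setup (illustrative, not taken from the paper):
      p, q  = probability of heads for agent 1 and agent 2
      u1    = (2p - 1) * (2q - 1)   expected payoff of agent 1
      u2    = -u1                   zero-sum
    Returns the total regret of each agent against the best fixed strategy
    in hindsight.
    """
    p, q = p0, q0
    payoff1 = 0.0   # cumulative realized payoff of agent 1
    sum_p = 0.0     # cumulative (2p_t - 1)
    sum_q = 0.0     # cumulative (2q_t - 1)
    for _ in range(steps):
        a = 2.0 * p - 1.0
        b = 2.0 * q - 1.0
        payoff1 += a * b
        sum_p += a
        sum_q += b
        # Each agent ascends its own payoff; both update simultaneously.
        grad_p = 2.0 * b     # d u1 / d p
        grad_q = -2.0 * a    # d u2 / d q
        p = clip01(p + eta * grad_p)
        q = clip01(q + eta * grad_q)
    # Payoffs are linear in each agent's own strategy, so the best fixed
    # strategy in hindsight is a pure one and earns |sum of opponent terms|.
    regret1 = abs(sum_q) - payoff1
    regret2 = abs(sum_p) + payoff1
    return regret1, regret2


if __name__ == "__main__":
    eta = 0.1  # constant step size, never decayed
    for horizon in (100, 1_000, 10_000, 100_000):
        r1, r2 = simulate_matching_pennies(horizon, eta)
        print(horizon, r1 / horizon, r2 / horizon)
```

Running the script prints the time-average regret of both agents at several horizons, which makes it possible to check how the average behaves as T grows while the step size stays fixed.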

fast and furious learning, non-vanishing step size, vanishing regret

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence > Representation & Reasoning
    - Agents (1.00)