Strategizing against No-regret Learners

Deng, Yuan; Schneider, Jon; Sivan, Balasubramanian

Nov-12-2025–arXiv.org Artificial Intelligence

How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that, under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any utility higher than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get utility strictly higher than the Stackelberg equilibrium utility. We characterize the optimal game-play for the player against a mean-based no-regret learner as the solution to a control problem. When the no-regret learner's strategy also guarantees him no swap regret, we show that the player cannot get any utility higher than the Stackelberg equilibrium utility.
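The first claim in the abstract (the player can secure roughly his Stackelberg utility) can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes the learner runs Hedge (multiplicative weights) as its no-regret algorithm, updates on expected payoffs so the run is deterministic, and faces a hypothetical 2x2 game whose payoff matrices `A` and `B` are made up for illustration. The function names `hedge_weights` and `play_fixed_commitment` are likewise illustrative. In this game the Stackelberg value for the player is 3.5, reached in the limit as he puts just under probability 0.5 on his first action, while the myopically dominant first action alone earns him about 2.

```python
import math


def hedge_weights(cum_payoffs, eta):
    """Hedge: probabilities proportional to exp(eta * cumulative payoff)."""
    top = max(cum_payoffs)  # subtract the max for numerical stability
    w = [math.exp(eta * (c - top)) for c in cum_payoffs]
    total = sum(w)
    return [v / total for v in w]


def play_fixed_commitment(A, B, x, rounds=2000, eta=0.5):
    """Player repeats the mixed strategy x; the learner runs Hedge.

    A[i][j] is the player's utility and B[i][j] the learner's utility when
    the player picks action i and the learner picks action j.
    Returns the player's average expected utility per round.
    """
    m, n = len(A), len(A[0])
    cum = [0.0] * n  # learner's cumulative expected payoff per action
    total = 0.0
    for _ in range(rounds):
        y = hedge_weights(cum, eta)
        total += sum(x[i] * A[i][j] * y[j] for i in range(m) for j in range(n))
        for j in range(n):
            cum[j] += sum(x[i] * B[i][j] for i in range(m))
    return total / rounds


# Hypothetical game (an assumption, not from the paper).
A = [[2, 4], [1, 3]]  # player's payoffs
B = [[1, 0], [0, 1]]  # learner's payoffs

commit = play_fixed_commitment(A, B, x=[0.4, 0.6])  # near-Stackelberg commitment
myopic = play_fixed_commitment(A, B, x=[1.0, 0.0])  # player's dominant action
print(commit, myopic)
```

With the commitment `x = [0.4, 0.6]` the learner's second action is its strict best response, so the player's average utility approaches 0.4 * 4 + 0.6 * 3 = 3.4 as `rounds` grows. Moving `x` closer to `[0.5, 0.5]` pushes that limit toward 3.5 but slows the learner's convergence. The sketch does not cover the stronger results for mean-based learners with more than two actions.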

artificial intelligence, learner, machine learning, (19 more...)


Country:
- North America (0.28)

Genre:
- Research Report (0.64)

Industry:
- Leisure & Entertainment > Games (0.88)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence
    - Machine Learning (0.70)
    - Representation & Reasoning (0.47)
