AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)

Neural Information Processing SystemsApr-25-2026, 02:58:14 GMT

AApproximate Target Maximum Welfare Minimum Relative Entropy Equilbiria We use a Minimum Relative Entropy (RME) (also known as minimum KL divergence) Pa (a)ln

This objective is similar to Maximum Entropy Correlated Equilibrium (MECE) [48], and the proofs here are similar to the framework set out there. A drawback of MECE is that it is not easy to determine the minimum p permissible. If we choose p that does not permit a valid solution, then the parameters will diverge. We can circumvent this problem by optimizing the distance to a target ˆ p. And µis for balancing the linear objective.

artificial intelligence, machine learning, payoff, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

Neural Information Processing SystemsFeb-19-2026, 08:34:13 GMT

Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability

Self-play is a technique for machine learning in multi-agent systems where a learning algorithm learns by interacting with copies of itself.

artificial intelligence, equilibrium, machine learning, (18 more...)

Country:

North America > Canada > Alberta (0.14)
North America > United States > Texas (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Poker (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsFeb-7-2026, 22:43:54 GMT

24f420aa4c99642dbb9aae18b166bbbc-Supplemental-Conference.pdf

graph, payoff, relative entropy objective, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

arXiv.org Artificial IntelligenceNov-11-2025

From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications

Cai, Yang, Luo, Haipeng, Wei, Chen-Yu, Zheng, Weiqiang

The convergence of online learning algorithms in games under self-play is a fundamental question in game theory and machine learning. Among various notions of convergence, last-iterate convergence is particularly desirable, as it reflects the actual decisions made by the learners and captures the day-to-day behavior of the learning dynamics. While many algorithms are known to converge in the average-iterate, achieving last-iterate convergence typically requires considerably more effort in both the design and the analysis of the algorithm. Somewhat surprisingly, we show in this paper that for a large family of games, there exists a simple black-box reduction that transforms the average iterates of an uncoupled learning dynamics into the last iterates of a new uncoupled learning dynamics, thus also providing a reduction from last-iterate convergence to average-iterate convergence. Our reduction applies to games where each player's utility is linear in both their own strategy and the joint strategy of all opponents. This family includes two-player bimatrix games and generalizations such as multi-player polymatrix games. By applying our reduction to the Optimistic Multiplicative Weights Update algorithm, we obtain new state-of-the-art last-iterate convergence rates for uncoupled learning dynamics in multi-player zero-sum polymatrix games: (1) an $O(\frac{\log d}{T})$ last-iterate convergence rate under gradient feedback, representing an exponential improvement in the dependence on the dimension $d$ (i.e., the maximum number of actions available to either player); and (2) an $\widetilde{O}(d^{\frac{1}{5}} T^{-\frac{1}{5}})$ last-iterate convergence rate under bandit feedback, improving upon the previous best rates of $\widetilde{O}(\sqrt{d} T^{-\frac{1}{8}})$ and $\widetilde{O}(\sqrt{d} T^{-\frac{1}{6}})$.

artificial intelligence, game theory, machine learning, (13 more...)

2506.03464

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Neural Information Processing SystemsOct-10-2025, 23:12:32 GMT

40386e4770bebd63fdf47cbc67341c0b-Paper-Conference.pdf

artificial intelligence, equilibrium, machine learning, (19 more...)

Country:

North America > Canada > Alberta (0.14)
North America > United States > Texas (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Poker (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kara, Semih, Başar, Tamer

Aggregate Fictitious Play for Learning in Anonymous Polymatrix Games (Extended Version)

arXiv.org Artificial IntelligenceAug-28-2025

Fictitious play (FP) is a well-studied algorithm that enables agents to learn Nash equilibrium in games with certain reward structures. However, when agents have no prior knowledge of the reward functions, FP faces a major challenge: the joint action space grows exponentially with the number of agents, which slows down reward exploration. Anonymous games offer a structure that mitigates this issue. In these games, the rewards depend only on the actions taken; not on who is taking which action. Under such a structure, we introduce aggregate fictitious play (agg-FP), a variant of FP where each agent tracks the frequency of the number of other agents playing each action, rather than these agents' individual actions. We show that in anonymous polymatrix games, agg-FP converges to a Nash equilibrium under the same conditions as classical FP. In essence, by aggregating the agents' actions, we reduce the action space without losing the convergence guarantees. Using simulations, we provide empirical evidence on how this reduction accelerates convergence.

agent, artificial intelligence, machine learning, (17 more...)

2508.19371

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.49)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Liu, Mingyang, Farina, Gabriele, Ozdaglar, Asuman

Differentially Private Equilibrium Finding in Polymatrix Games

arXiv.org Artificial IntelligenceMar-12-2025

We study equilibrium finding in polymatrix games under differential privacy constraints. To start, we show that high accuracy and asymptotically vanishing differential privacy budget (as the number of players goes to infinity) cannot be achieved simultaneously under either of the two settings: (i) We seek to establish equilibrium approximation guarantees in terms of Euclidean distance to the equilibrium set, and (ii) the adversary has access to all communication channels. Then, assuming the adversary has access to a constant number of communication channels, we develop a novel distributed algorithm that recovers strategies with simultaneously vanishing Nash gap (in expected utility, also referred to as exploitability and privacy budget as the number of players increases.

algorithm, equilibrium, polymatrix game, (12 more...)

2503.09538

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MacQueen, Revan, Wright, James R.

Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability

arXiv.org Artificial IntelligenceNov-29-2023

Self-play is a technique for machine learning in multi-agent systems where a learning algorithm learns by interacting with copies of itself. Self-play is useful for generating large quantities of data for learning, but has the drawback that the agents the learner will face post-training may have dramatically different behavior than the learner came to expect by interacting with itself. For the special case of two-player constant-sum games, self-play that reaches Nash equilibrium is guaranteed to produce strategies that perform well against any post-training opponent; however, no such guarantee exists for multiplayer games. We show that in games that approximately decompose into a set of two-player constant-sum games (called constant-sum polymatrix games) where global $\epsilon$-Nash equilibria are boundedly far from Nash equilibria in each subgame (called subgame stability), any no-external-regret algorithm that learns by self-play will produce a strategy with bounded vulnerability. For the first time, our results identify a structural property of multiplayer games that enable performance guarantees for the strategies produced by a broad class of self-play algorithms. We demonstrate our findings through experiments on Leduc poker.

algorithm, equilibrium, strategy profile, (17 more...)