Solving Zero-Sum Convex Markov Games
Kalogiannis, Fivos, Vlatakis-Gkaragkounis, Emmanouil-Vasileios, Gemp, Ian, Piliouras, Georgios
–arXiv.org Artificial Intelligence
We contribute the first provable guarantees of global convergence to Nash equilibria (NE) in two-player zero-sum convex Markov games (cMGs) by using independent policy gradient methods. Convex Markov games, recently defined by Gemp et al. (2024), extend Markov decision processes to multi-agent settings with preferences that are convex over occupancy measures, offering a broad framework for modeling generic strategic interactions. However, even the fundamental min-max case of cMGs presents significant challenges, including inherent nonconvexity, the absence of Bellman consistency, and the complexity of the infinite horizon. We follow a two-step approach. First, leveraging properties of hidden-convex--hidden-concave functions, we show that a simple nonconvex regularization transforms the min-max optimization problem into a nonconvex-proximal Polyak-Lojasiewicz (NC-pPL) objective. Crucially, this regularization can stabilize the iterates of independent policy gradient methods and ultimately lead them to converge to equilibria. Second, building on this reduction, we address the general constrained min-max problems under NC-pPL and two-sided pPL conditions, providing the first global convergence guarantees for stochastic nested and alternating gradient descent-ascent methods, which we believe may be of independent interest.
arXiv.org Artificial Intelligence
Jun-23-2025
- Country:
- North America > United States
- Texas (0.04)
- Wisconsin > Dane County
- Madison (0.14)
- New York > New York County
- New York City (0.04)
- Illinois > Cook County
- Chicago (0.04)
- California > San Diego County
- Europe
- Italy (0.04)
- Greece (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- Asia
- Middle East > Jordan (0.04)
- China (0.04)
- Japan > Honshū
- Chūgoku > Hiroshima Prefecture > Hiroshima (0.04)
- North America > United States
- Genre:
- Research Report (0.81)
- Overview (0.67)
- Industry:
- Leisure & Entertainment > Games (1.00)
- Energy (0.76)
- Technology:
- Information Technology > Artificial Intelligence
- Robots (1.00)
- Representation & Reasoning
- Optimization (1.00)
- Agents (1.00)
- Machine Learning
- Reinforcement Learning (1.00)
- Statistical Learning > Gradient Descent (0.49)
- Neural Networks > Deep Learning (0.45)
- Information Technology > Artificial Intelligence