AITopics | ne-gap

Collaborating Authors

ne-gap

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games

Neural Information Processing Systems | Apr-24-2026, 13:11:09 GMT

Softmax policy gradient is a popular algorithm for policy optimization in single-agent reinforcement learning, particularly since projection is not needed for each gradient update. However, in multi-agent systems, the lack of central coordination introduces significant additional difficulties in the convergence analysis. Even for a stochastic game with identical interest, there can be multiple Nash Equilibria (NEs), which rules out proof techniques that rely on the existence of a unique global optimum. Moreover, the softmax parameterization introduces non-NE policies with zero gradient, making it difficult for gradient-based algorithms to find NEs. In this paper, we study the finite-time convergence of decentralized softmax gradient play in a special form of game, Markov Potential Games (MPGs), which includes the identical interest game as a special case. We investigate both gradient play and natural gradient play, with and without log-barrier regularization. The established convergence rates for the unregularized cases contain a trajectory-dependent constant that can be arbitrarily large, whereas the log-barrier regularization overcomes this drawback, at the cost of slightly worse dependence on other factors such as the action set size. An empirical study on an identical interest matrix game confirms the theoretical findings.
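To make the abstract above concrete, here is a minimal sketch of decentralized softmax gradient play on a two-player identical interest matrix game, with an optional log-barrier term. Everything in it is an illustrative assumption rather than the paper's algorithm or experiment: the reward matrix, step size `eta`, regularization weight `lam`, the helper names, and the specific regularizer form `(lam / |A|) * sum_a log pi(a)`, whose scaling may differ from the one analyzed in the paper. The NE-gap is computed as the largest gain any single player can obtain by deviating to a best response.

```python
import math

def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def action_values(R, p1, p2):
    """Expected reward of each pure action for both players, plus the joint value."""
    n, m = len(R), len(R[0])
    q1 = [sum(R[a][b] * p2[b] for b in range(m)) for a in range(n)]
    q2 = [sum(R[a][b] * p1[a] for a in range(n)) for b in range(m)]
    v = sum(p1[a] * q1[a] for a in range(n))
    return q1, q2, v

def ne_gap(R, p1, p2):
    """Largest unilateral improvement available to either player."""
    q1, q2, v = action_values(R, p1, p2)
    return max(max(q1) - v, max(q2) - v)

def softmax_gradient_play(R, steps=2000, eta=0.5, lam=0.0):
    """Each player ascends its own softmax gradient; no coordination is used.

    lam > 0 adds the gradient of the assumed log-barrier
    (lam / |A|) * sum_a log pi(a), which equals lam * (1/|A| - pi(a)).
    """
    n, m = len(R), len(R[0])
    theta1, theta2 = [0.0] * n, [0.0] * m  # uniform initial policies
    for _ in range(steps):
        p1, p2 = softmax(theta1), softmax(theta2)
        q1, q2, v = action_values(R, p1, p2)
        g1 = [p1[a] * (q1[a] - v) + lam * (1.0 / n - p1[a]) for a in range(n)]
        g2 = [p2[b] * (q2[b] - v) + lam * (1.0 / m - p2[b]) for b in range(m)]
        theta1 = [t + eta * g for t, g in zip(theta1, g1)]
        theta2 = [t + eta * g for t, g in zip(theta2, g2)]
    return softmax(theta1), softmax(theta2)

# Hypothetical 2x2 identical interest game: both players receive R[a][b].
R = [[1.0, 0.0], [0.0, 2.0]]
p1, p2 = softmax_gradient_play(R)
p1_reg, p2_reg = softmax_gradient_play(R, lam=0.01)
print("NE-gap without regularization:", ne_gap(R, p1, p2))
print("NE-gap with log-barrier:", ne_gap(R, p1_reg, p2_reg))
```

In this sketch the log-barrier keeps every action probability bounded away from zero, which is the mechanism the abstract credits with removing the trajectory-dependent constant; the price is a small residual NE-gap of order `lam`.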

artificial intelligence, gradient play, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America (0.45)
Oceania > Australia (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Supplementary Materials for "Multi-Agent Meta-Reinforcement Learning": A. Technical Lemmas

Neural Information Processing Systems | Feb-17-2026, 06:30:20 GMT

From the three-point identity of the Bregman divergence (Lemma 3.1 of [9]), $\mathrm{KL}(x \,\|\, y) - \mathrm{KL}(\tilde{x} \,\|\, y) = \mathrm{KL}(x \,\|\, \tilde{x}) + \langle \ln \tilde{x} - \ln y,\; x - \tilde{x} \rangle$ (12). The first term in (12) can be bounded by $\mathrm{KL}(x \,\|\, \tilde{x}) = \dots$ By Hölder's inequality, the second term in (12) is bounded as $\langle \ln \tilde{x} - \ln y,\; x - \tilde{x} \rangle \le \dots$ Lemma 5. Consider a block diagonal matrix ... We prove the lemma via induction on $N$. This completes the induction proof. Lemma 6. We introduce one more notation before presenting the proof. This leads us to the initialization-dependent convergence rate of Algorithm 1, which we re-state and prove as follows. In addition, if we initialize the players' policies to be uniform policies, i.e., ... The rest of the proof follows by putting all the aforementioned results together.
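The excerpt above is truncated, so the following is only a numerical sanity check under stated assumptions. It assumes identity (12) reads KL(x‖y) − KL(x̃‖y) = KL(x‖x̃) + ⟨ln x̃ − ln y, x − x̃⟩ for probability vectors with full support, and it checks one valid Hölder pairing (the ∞-norm of the first factor against the 1-norm of the second), which may not be the pairing used in the supplement. The vectors `x`, `x_tilde`, and `y` are arbitrary illustrative values.

```python
import math

def kl(p, q):
    """KL divergence between two full-support probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Arbitrary full-support distributions chosen for illustration only.
x = [0.2, 0.3, 0.5]
x_tilde = [0.4, 0.4, 0.2]
y = [0.1, 0.6, 0.3]

# Left-hand side of the assumed form of identity (12).
lhs = kl(x, y) - kl(x_tilde, y)

# Right-hand side: KL(x || x_tilde) plus the inner-product term.
u = [math.log(a) - math.log(b) for a, b in zip(x_tilde, y)]
w = [a - b for a, b in zip(x, x_tilde)]
inner = sum(ui * wi for ui, wi in zip(u, w))
rhs = kl(x, x_tilde) + inner

# Hoelder bound with the (infinity, 1) norm pairing (an assumption).
holder_bound = max(abs(ui) for ui in u) * sum(abs(wi) for wi in w)

print("identity residual:", abs(lhs - rhs))
print("inner product:", inner, "Hoelder bound:", holder_bound)
```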

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Youbang Sun

Neural Information Processing Systems | Feb-15-2026, 17:43:59 GMT

A major challenge in the analysis of multi-agent systems is the restriction on joint policies of agents.

artificial intelligence, ne-gap, potential game, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Brazos County > College Station (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

0cd4c8c7ba098b199242c6634f43f653-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 10:45:21 GMT

convergence, gradient play, natural gradient play, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.83)

0cd4c8c7ba098b199242c6634f43f653-Paper-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 10:45:17 GMT

gradient play, natural gradient play, regularization, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.93)

Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Youbang Sun

Neural Information Processing Systems | Oct-9-2025, 00:40:53 GMT

A major challenge in the analysis of multi-agent systems is the restriction on joint policies of agents.

artificial intelligence, ne-gap, potential game, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Brazos County > College Station (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games

Neural Information Processing Systems | Oct-2-2025, 04:12:57 GMT

The stochastic game (SG) is a classical multi-agent model that has received extensive attention in recent MARL studies.

artificial intelligence, gradient play, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.85)

Gradient play in stochastic games: stationary points, convergence, and sample complexity

Zhang, Runyu, Ren, Zhaolin, Li, Na

arXiv.org Artificial Intelligence | Dec-6-2023

We study the performance of the gradient play algorithm for stochastic games (SGs), where each agent tries to maximize its own total discounted reward by making decisions independently based on current state information which is shared between agents. Policies are directly parameterized by the probability of choosing a certain action at a given state. We show that Nash equilibria (NEs) and first-order stationary policies are equivalent in this setting, and give a local convergence rate around strict NEs. Further, for a subclass of SGs called Markov potential games (which includes the setting with identical rewards as an important special case), we design a sample-based reinforcement learning algorithm and give a non-asymptotic global convergence rate analysis for both exact gradient play and our sample-based learning algorithm. Our result shows that the number of iterations to reach an $\epsilon$-NE scales linearly, instead of exponentially, with the number of agents. Local geometry and local stability are also considered, where we prove that strict NEs are local maxima of the total potential function and fully-mixed NEs are saddle points.
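As a small illustration of gradient play with direct parameterization, the sketch below runs simultaneous projected gradient ascent on a stateless two-player identical interest game and checks first-order stationarity by testing whether one more projected step leaves the policies unchanged. The reward matrix, step size, starting points, and helper names are all hypothetical, and a matrix game is only the simplest special case of the stochastic games studied in the paper. The projection is the standard sort-based Euclidean projection onto the probability simplex.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, tau = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:  # holds exactly for the leading indices
            tau = t
    return [max(x - tau, 0.0) for x in v]

def gradients(R, p1, p2):
    """Partial derivatives of p1^T R p2 with respect to each player's policy."""
    n, m = len(R), len(R[0])
    g1 = [sum(R[a][b] * p2[b] for b in range(m)) for a in range(n)]
    g2 = [sum(R[a][b] * p1[a] for a in range(n)) for b in range(m)]
    return g1, g2

def gradient_play_step(R, p1, p2, eta):
    """One simultaneous projected gradient step for both players."""
    g1, g2 = gradients(R, p1, p2)
    new_p1 = project_simplex([p + eta * g for p, g in zip(p1, g1)])
    new_p2 = project_simplex([p + eta * g for p, g in zip(p2, g2)])
    return new_p1, new_p2

def gradient_play(R, p1, p2, steps=200, eta=0.1):
    for _ in range(steps):
        p1, p2 = gradient_play_step(R, p1, p2, eta)
    return p1, p2

def is_stationary(R, p1, p2, eta=0.1, tol=1e-9):
    """First-order stationarity: a further projected step changes nothing."""
    q1, q2 = gradient_play_step(R, p1, p2, eta)
    moved = [abs(a - b) for a, b in zip(p1 + p2, q1 + q2)]
    return max(moved) < tol

# Hypothetical game with two strict NEs: (0, 0) with reward 1, (1, 1) with reward 2.
R = [[1.0, 0.0], [0.0, 2.0]]
print(gradient_play(R, [0.5, 0.5], [0.5, 0.5]))
print(gradient_play(R, [0.9, 0.1], [0.9, 0.1]))
```

The two starting points are chosen to show that gradient play can settle at different NEs: a stationary point here is an equilibrium but need not maximize the shared reward.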

algorithm, convergence, gradient play, (14 more...)

arXiv.org Artificial Intelligence

arXiv:2106.00198

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.65)

Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Sun, Youbang, Liu, Tao, Zhou, Ruida, Kumar, P. R., Shahrampour, Shahin

arXiv.org Artificial Intelligence | Oct-27-2023

This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent reinforcement learning problem in Markov potential games. It is shown that, under mild technical assumptions and the introduction of the suboptimality gap, the independent NPG method with an oracle providing exact policy evaluation asymptotically reaches an $\epsilon$-Nash Equilibrium (NE) within $\mathcal{O}(1/\epsilon)$ iterations. This improves upon the previous best result of $\mathcal{O}(1/\epsilon^2)$ iterations and is of the same order, $\mathcal{O}(1/\epsilon)$, that is achievable for the single-agent case. Empirical results for a synthetic potential game and a congestion game are presented to verify the theoretical bounds.
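The independent NPG update described above can be sketched in its simplest stateless form, where under a softmax parameterization each player multiplies its current policy by the exponentiated value of each action and renormalizes. This is a simplification under stated assumptions, not the paper's algorithm: in a Markov game the exponent would be built from per-state advantage estimates supplied by the policy evaluation oracle, and the 3x3 reward matrix, step size `eta`, and function names here are hypothetical. The recorded NE-gap is the largest gain available to either player from a unilateral best response.

```python
import math

def npg_step(policy, values, eta):
    """Multiplicative-weights form of a softmax NPG step for one player."""
    weights = [p * math.exp(eta * q) for p, q in zip(policy, values)]
    total = sum(weights)
    return [w / total for w in weights]

def independent_npg(R, steps=50, eta=1.0):
    """Both players update independently from their own action values."""
    n, m = len(R), len(R[0])
    p1, p2 = [1.0 / n] * n, [1.0 / m] * m  # uniform initial policies
    gaps = []
    for _ in range(steps):
        q1 = [sum(R[a][b] * p2[b] for b in range(m)) for a in range(n)]
        q2 = [sum(R[a][b] * p1[a] for a in range(n)) for b in range(m)]
        v = sum(p1[a] * q1[a] for a in range(n))
        gaps.append(max(max(q1) - v, max(q2) - v))  # NE-gap before the update
        p1, p2 = npg_step(p1, q1, eta), npg_step(p2, q2, eta)
    return p1, p2, gaps

# Hypothetical identical interest game: reward only when the actions match.
R = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
p1, p2, gaps = independent_npg(R)
print("first NE-gap:", gaps[0], "last NE-gap:", gaps[-1])
```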

lemma 3, ne-gap, potential game, (12 more...)

arXiv.org Artificial Intelligence

arXiv:2310.09727

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Brazos County > College Station (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Industry: Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
