Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games
–Neural Information Processing Systems
It is shown that, under mild technical assumptions and the introduction of the \textit{suboptimality gap}, the independent NPG method with an oracle providing exact policy evaluation asymptotically reaches an \epsilon -Nash Equilibrium (NE) within \mathcal{O}(1/\epsilon) iterations. This improves upon the previous best result of \mathcal{O}(1/\epsilon 2) iterations and is of the same order, \mathcal{O}(1/\epsilon), that is achievable for the single-agent case. Empirical results for a synthetic potential game and a congestion game are presented to verify the theoretical bounds.
Neural Information Processing Systems
Jan-19-2025, 13:38:11 GMT
- Technology: