Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games
Tao Liu
Neural Information Processing Systems
It is shown that, under mild technical assumptions and the introduction of the suboptimality gap, the independent natural policy gradient (NPG) method with an oracle providing exact policy evaluation asymptotically reaches an ϵ-Nash Equilibrium (NE) within O(1/ϵ) iterations.
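To make the objects in this statement concrete, here is a minimal sketch of independent NPG with exact evaluation on a toy problem. It assumes a single-state, two-player identical-interest game (a special case of a Markov potential game) and the usual softmax form of the NPG step, in which each player multiplies its policy by the exponentiated action values and renormalizes. The payoff matrix, the step size `eta`, and all function names are hypothetical and are not taken from the paper. The sketch only illustrates the update and the NE-gap; it does not reproduce the O(1/ϵ) analysis.

```python
import math

# Hypothetical 2x2 identical-interest game: both players receive PAYOFF[a][b]
# when player X plays a and player Y plays b. The common payoff is the potential.
PAYOFF = [[1.0, 0.0],
          [0.0, 2.0]]


def normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def action_values(pi_x, pi_y):
    """Exact evaluation: expected payoff of each action against the other player's policy."""
    q_x = [sum(PAYOFF[a][b] * pi_y[b] for b in range(2)) for a in range(2)]
    q_y = [sum(PAYOFF[a][b] * pi_x[a] for a in range(2)) for b in range(2)]
    return q_x, q_y


def nash_gap(pi_x, pi_y):
    """Largest gain any single player can obtain by deviating unilaterally."""
    q_x, q_y = action_values(pi_x, pi_y)
    gap_x = max(q_x) - sum(p * q for p, q in zip(pi_x, q_x))
    gap_y = max(q_y) - sum(p * q for p, q in zip(pi_y, q_y))
    return max(gap_x, gap_y)


def npg_step(pi, q, eta):
    """Softmax NPG step (assumed form): pi(a) is proportional to pi(a) * exp(eta * q(a))."""
    return normalize([p * math.exp(eta * v) for p, v in zip(pi, q)])


def run(steps=200, eta=0.5):
    """Both players update simultaneously, each using only its own action values."""
    pi_x, pi_y = [0.5, 0.5], [0.5, 0.5]
    gaps = []
    for _ in range(steps):
        gaps.append(nash_gap(pi_x, pi_y))
        q_x, q_y = action_values(pi_x, pi_y)
        pi_x, pi_y = npg_step(pi_x, q_x, eta), npg_step(pi_y, q_y, eta)
    gaps.append(nash_gap(pi_x, pi_y))
    return pi_x, pi_y, gaps


if __name__ == "__main__":
    final_x, final_y, gap_history = run()
    print("final policies:", final_x, final_y)
    print("final NE-gap:", gap_history[-1])
```

A policy pair is an ϵ-NE once `nash_gap` falls to ϵ or below, so counting the iterations needed to reach a given ϵ in `gap_history` is one rough way to examine convergence speed on a small example like this one.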
Feb-11-2025, 03:05:33 GMT