Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Neural Information Processing Systems 

Under mild technical assumptions, and with the introduction of the suboptimality gap, the independent natural policy gradient (NPG) method with an oracle providing exact policy evaluation is shown to asymptotically reach an ϵ-Nash Equilibrium (NE) within O(1/ϵ) iterations.
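To make the setting concrete, the following is a minimal sketch of independent NPG on a toy problem, not the paper's algorithm or experiments. It assumes a single-state, two-player identical-interest matrix game, which is a special case of a Markov potential game where the shared reward serves as the potential. With softmax policies, the NPG step reduces to a multiplicative-weights update driven by each player's own advantage. The advantages are computed exactly, standing in for the exact policy-evaluation oracle. The names `independent_npg`, `nash_gap`, `eta`, and `iters` are hypothetical and chosen for illustration only.

```python
import numpy as np


def independent_npg(reward, eta=0.5, iters=200):
    """Independent NPG with softmax policies on a two-player
    identical-interest matrix game (illustrative assumption:
    one state, shared reward matrix `reward`)."""
    n1, n2 = reward.shape
    p1 = np.full(n1, 1.0 / n1)  # player 1 starts from the uniform policy
    p2 = np.full(n2, 1.0 / n2)  # player 2 starts from the uniform policy
    for _ in range(iters):
        # Exact "oracle" evaluation: expected reward of each own action
        # given the other player's current policy.
        q1 = reward @ p2
        q2 = reward.T @ p1
        # Advantage = action value minus the current policy's value.
        a1 = q1 - p1 @ q1
        a2 = q2 - p2 @ q2
        # Simultaneous, independent multiplicative-weights (NPG) step.
        new_p1 = p1 * np.exp(eta * a1)
        new_p2 = p2 * np.exp(eta * a2)
        p1 = new_p1 / new_p1.sum()
        p2 = new_p2 / new_p2.sum()
    return p1, p2


def nash_gap(reward, p1, p2):
    """Largest gain any single player can get by deviating unilaterally."""
    value = p1 @ reward @ p2
    gain1 = (reward @ p2).max() - value
    gain2 = (reward.T @ p1).max() - value
    return max(gain1, gain2)
```

A policy pair is an ϵ-NE when `nash_gap` is at most ϵ, so tracking this quantity over iterations is one way to observe the convergence behavior described above on small examples. Each player uses only its own action values here, which is what makes the update independent.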
