On the convergence of policy gradient methods to Nash equilibria in general stochastic games
