On the Convergence of Discounted Policy Gradient Methods
