Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
