Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings