On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
