An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Neural Information Processing Systems
In this paper, we revisit and improve the convergence analysis of policy gradient (PG) and natural PG (NPG) methods, and of their variance-reduced variants, under general smooth policy parametrizations.
Feb-8-2026, 11:46:44 GMT