Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Neural Information Processing Systems
The conditions on the representation that imply global convergence differ between these two algorithms.
Feb-12-2026, 16:41:21 GMT