Policy Gradient Algorithms Implicitly Optimize by Continuation