PolicyPolicyUpdates

Neural Information Processing Systems 

We tackle this planning issue by extending the policy gradient theory to policy updates with respecttoanystatedensity.