Gradient Informed Proximal Policy Optimization

Neural Information Processing Systems 

We introduce a novel policy learning method that integrates analytical gradients from differentiable environments with the Proximal Policy Optimization (PPO) algorithm.
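As a rough illustration of the two ingredients named above, the sketch below pairs the standard PPO clipped surrogate objective with an analytical gradient from a toy differentiable reward. It does not reproduce the paper's method for combining them. The function names, the quadratic reward, and the default `clip_eps=0.2` are assumptions made for this example only.

```python
def ppo_clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective, averaged over samples.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s)
    advantages: advantage estimates for the same samples
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the ratio into [1 - eps, 1 + eps], then take the pessimistic
        # (smaller) of the clipped and unclipped terms.
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


def toy_reward(action, target=1.0):
    """Hypothetical differentiable reward: negative squared distance to target."""
    return -(action - target) ** 2


def toy_reward_grad(action, target=1.0):
    """Analytical derivative of toy_reward with respect to the action."""
    return -2.0 * (action - target)
```

In a differentiable environment, a derivative like `toy_reward_grad` is available in closed form or through automatic differentiation, whereas plain PPO uses only sampled returns inside `ppo_clipped_surrogate`.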
