Evolved Policy Gradients
Rein Houthooft, Yuhua Chen, Phillip Isola, Bradly Stadie, Filip Wolski, OpenAI Jonathan Ho, Pieter Abbeel
Neural Information Processing Systems
The idea is to evolve a differentiable loss function, such that an agent, which optimizes its policy to minimize this loss, will achieve high rewards.
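To make the two nested loops concrete, here is a minimal toy sketch. It is an illustration under stated assumptions, not the authors' implementation. The scalar policy `theta`, the quadratic loss family `(theta - phi)^2`, the reward function, and all hyperparameters are hypothetical, and the outer loop uses a simple Gaussian-perturbation evolution strategy as one possible way to "evolve" the loss parameter `phi`.

```python
import random


def reward(theta):
    # Hypothetical task: reward is highest when the policy parameter equals 3.
    return -(theta - 3.0) ** 2


def inner_loop(phi, steps=50, lr=0.1):
    # The agent never sees the reward here. It only runs gradient descent on
    # the (assumed) differentiable loss L_phi(theta) = (theta - phi)^2.
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * (theta - phi)  # dL/dtheta
        theta -= lr * grad
    return theta


def evolve_loss(generations=200, population=20, sigma=0.1, outer_lr=0.05, seed=0):
    # Outer loop: perturb the loss parameter, train an agent under each
    # perturbed loss, and move phi toward perturbations that earned more reward.
    rng = random.Random(seed)
    phi = 0.0
    for _ in range(generations):
        noise = [rng.gauss(0.0, 1.0) for _ in range(population)]
        returns = [reward(inner_loop(phi + sigma * e)) for e in noise]
        mean_r = sum(returns) / population
        std_r = (sum((r - mean_r) ** 2 for r in returns) / population) ** 0.5 + 1e-8
        # Normalize returns so the step size does not depend on reward scale.
        step = sum((r - mean_r) / std_r * e for r, e in zip(returns, noise)) / population
        phi += outer_lr * step
    return phi


if __name__ == "__main__":
    phi = evolve_loss()
    theta = inner_loop(phi)
    print(f"evolved phi={phi:.3f}, trained theta={theta:.3f}, reward={reward(theta):.4f}")
```

In this sketch the reward signal reaches the agent only indirectly, through the evolved loss: the inner loop minimizes `L_phi`, while the outer loop selects `phi` by the reward the trained agent ends up with.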
Feb-13-2026, 08:18:27 GMT