Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

Benjamin Eysenbach

Neural Information Processing Systems 

Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency.
