Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

Benjamin Eysenbach

Neural Information Processing Systems 

Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency.