Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

Ben Eysenbach, Russ R. Salakhutdinov, Sergey Levine

Neural Information Processing Systems 

The history of learning for control has been an exciting back and forth between two broad classes of algorithms: planning and reinforcement learning.