Learning to Execute: Efficiently Learning Universal Plan-Conditioned Policies in Robotics

Neural Information Processing Systems 

In this sense, the respective strengths and weaknesses of reinforcement learning (RL) and model-based planners are complementary.