Robust Imitation of a Few Demonstrations with a Backwards Model

Neural Information Processing Systems 

By imitating both the demonstrations and rollouts generated by the backwards model, the agent learns the demonstrated paths and how to get back onto them.
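
To make the idea concrete, here is a minimal sketch on a toy grid world. It is an illustration under stated assumptions, not the paper's implementation: the backwards model is replaced by exact, hand-written inverse dynamics, the rollouts enumerate every backward action sequence instead of sampling, and imitation is a lookup table that keeps, for each state, the action seen closest to the demonstration. All names (`backward_step`, `backward_rollouts`, `fit_policy`, `run_policy`) and the grid itself are hypothetical.

```python
from itertools import product

# Toy deterministic grid world (an assumption made for illustration).
ACTIONS = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}
GOAL = (5, 0)


def step(state, action):
    """Forward dynamics: move by the action's offset."""
    dx, dy = ACTIONS[action]
    return (state[0] + dx, state[1] + dy)


def backward_step(next_state, action):
    """Stand-in backwards model: the state from which `action` reaches next_state."""
    dx, dy = ACTIONS[action]
    return (next_state[0] - dx, next_state[1] - dy)


def backward_rollouts(start, horizon):
    """Yield (state, action, depth) along every backward action sequence from start."""
    for seq in product(ACTIONS, repeat=horizon):
        state = start
        for depth, action in enumerate(seq, start=1):
            state = backward_step(state, action)  # predecessor of the previous state
            yield state, action, depth


def fit_policy(samples):
    """Tabular imitation: per state, keep the action observed at the smallest depth."""
    best = {}
    for state, action, depth in samples:
        if state != GOAL and (state not in best or depth < best[state][1]):
            best[state] = (action, depth)
    return {state: action for state, (action, _) in best.items()}


def run_policy(policy, state, max_steps=20):
    """Follow the policy until the goal, an unknown state, or the step limit."""
    path = [state]
    while state != GOAL and state in policy and len(path) <= max_steps:
        state = step(state, policy[state])
        path.append(state)
    return path


# Demonstration: walk right along y = 0 from (0, 0) to the goal (depth 0 = real data).
demo = [((x, 0), "right", 0) for x in range(5)]
# Imagined data: short backward rollouts starting from each demonstrated state.
imagined = [s for d in demo for s in backward_rollouts(d[0], horizon=2)]

demo_only = fit_policy(demo)
augmented = fit_policy(demo + imagined)

print(run_policy(demo_only, (2, 2)))   # no action known off the demonstrated path
print(run_policy(augmented, (2, 2)))   # returns to the path, then follows it
```

In this toy setting the policy trained on demonstrations alone has no action for the off-path state `(2, 2)`, while the policy trained on the augmented data steps back onto the demonstrated row and follows it to the goal. Preferring the smallest depth is a design choice of this sketch: it ensures each imagined action moves the agent strictly closer to a demonstrated state, so the tabular policy cannot cycle.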