Data augmentation for efficient learning from parametric experts

Neural Information Processing Systems 

We present a simple yet effective data-augmentation technique to enable data-efficient learning from parametric experts for reinforcement and imitation learning. We focus on what we call the policy cloning setting, in which we use online or offline queries of an expert or expert policy to inform the behavior of a student policy.