Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch

Neural Information Processing Systems

There is often a mismatch between the learner and the expert's