Model-Based Reward Shaping for Adversarial Inverse Reinforcement Learning in Stochastic Environments