A Novel Variational Lower Bound for Inverse Reinforcement Learning