Recovering from Out-of-sample States via Inverse Dynamics in Offline Reinforcement Learning

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics