Towards Data-Driven Offline Simulations for Online Reinforcement Learning