Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

Neural Information Processing Systems 

Although this is a common approach in supervised learning, to our knowledge it has not been discussed in detail in the offline RL setting.