MOPO: Model-based Offline Policy Optimization

Neural Information Processing Systems 

However, standard model-based RL methods are designed for the online setting and provide no explicit mechanism for avoiding the distributional shift that arises in the offline setting.