Weighted model estimation for offline model-based reinforcement learning