Review for NeurIPS paper: Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning


Weaknesses: While I find this paper reasonably thorough, I am skeptical of the novelty. The two components that differentiate the method from Dreamer both stem from the mutual information maximization objective, which decomposes into maximizing the policy entropy and minimizing the model loss. There is an ablation showing what happens when the model-loss component is removed, but no corresponding ablation removing the entropy maximization. My suspicion is that the improvement is driven primarily by the model-loss term, which would not be a surprising result; adding this ablation would address the concern.
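
To make the point concrete, my reading of the objective rests on the standard decomposition of mutual information (the trajectory notation below is mine, not the paper's): writing $\tau_{\text{img}}$ for imagined trajectories and $\tau_{\text{real}}$ for real trajectories,

$I(\tau_{\text{img}}; \tau_{\text{real}}) = H(\tau_{\text{img}}) - H(\tau_{\text{img}} \mid \tau_{\text{real}})$,

where maximizing the first term corresponds to the policy-entropy component and minimizing the conditional entropy corresponds to the model-loss component. The requested ablation would isolate which of these two terms actually drives the reported gains.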