Review for NeurIPS paper: Pre-training via Paraphrasing

Neural Information Processing Systems 

This paper presents a novel pretraining idea and demonstrates strong empirical results on a number of tasks. Right now the paper reads a bit like a system description, and it would be good to consider adding some ablation experiments to shed light on the various design choices. This might mean that some of the tasks need to be relegated to the appendix to create space for these additional ablation experiments. In the eyes of the AC, some ablations would be more useful than the current enumeration of tasks. It would also be good to think about alternative names for describing the MT setup.