
Neural Information Processing Systems 

Reviewers generally asked for more discussion on the relationship to other models (e.g. [...]

We'll discuss this, and we'd like to improve this in future work.

NCSN's sampler coefficients are set by hand post-hoc, and their training procedure is not guaranteed to directly [...]

Experimental details: see Appendix B.

[...] like GPUs), our CIFAR model trains at 21 steps/sec at batch size 128 (10.6 hours to train to completion at 800k steps), and [...]

We'd like to investigate how existing MCMC theory on this topic applies to our models.

In contrast, our model is trained on a simple, stable non-adversarial MSE loss.

"Is the diffusion setup key to the improvement?"
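
As a quick sanity check on the training-time figure quoted above, the following sketch recomputes wall-clock hours from the stated step count and throughput. The function name `training_hours` is hypothetical, and the calculation assumes the 21 steps/sec rate stays constant for the whole run.

```python
def training_hours(total_steps: int, steps_per_sec: float) -> float:
    """Wall-clock hours to run total_steps at a constant steps_per_sec.

    Assumption: throughput is constant; startup, evaluation, and
    checkpointing overhead are ignored.
    """
    seconds = total_steps / steps_per_sec
    return seconds / 3600.0


# Figures taken from the text: 800k steps at 21 steps/sec.
hours = training_hours(800_000, 21.0)
print(f"{hours:.1f} hours")  # prints "10.6 hours"
```

The result agrees with the 10.6 hours stated for the CIFAR model.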