Multistep Distillation of Diffusion Models via Moment Matching
Neural Information Processing Systems
We present a new method for making diffusion models faster to sample. The method distills many-step diffusion models into few-step models by matching conditional expectations of the clean data given noisy data along the sampling trajectory. Our approach extends recently proposed one-step methods to the multistep case and provides a new perspective by interpreting these approaches in terms of moment matching. Using up to 8 sampling steps, we obtain distilled models that outperform not only their one-step versions but also their original many-step teacher models, obtaining new state-of-the-art results on the ImageNet dataset. We also show promising results on a large text-to-image model, where we achieve fast generation of high-resolution images directly in image space, without needing autoencoders or upsamplers.

Figure 1: Selected 8-step samples from our distilled text-to-image model.
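The quantity the abstract says is matched, the conditional expectation of the clean data given noisy data, can be illustrated on a toy problem. The sketch below is not the paper's training procedure. It assumes 1-D Gaussian data, a cosine noise schedule, and a single noise level `t = 0.5`, all chosen here for illustration. Under those assumptions E[x | z] has a closed form, and a least-squares fit of x on z (the same mean-squared-error objective a denoiser minimizes) should recover it. The function names are hypothetical.

```python
import numpy as np


def analytic_posterior_mean(z, alpha, sigma, mean, std):
    """Closed-form E[x | z] for x ~ N(mean, std^2) and z = alpha * x + sigma * eps."""
    gain = alpha * std**2 / (alpha**2 * std**2 + sigma**2)
    return mean + gain * (z - alpha * mean)


def fit_linear_denoiser(x, z):
    """Least-squares regression of clean x on noisy z; returns (slope, intercept)."""
    slope, intercept = np.polyfit(z, x, deg=1)
    return slope, intercept


rng = np.random.default_rng(0)

# Toy data distribution (assumption: 1-D Gaussian, not from the paper).
mean, std = 2.0, 0.5

# Assumed cosine schedule with alpha^2 + sigma^2 = 1, evaluated at t = 0.5.
t = 0.5
alpha, sigma = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

# Draw clean samples and corrupt them with Gaussian noise.
x = rng.normal(mean, std, size=200_000)
z = alpha * x + sigma * rng.normal(size=x.shape)

# The MSE-optimal predictor of x from z is E[x | z]; here it is linear in z.
slope, intercept = fit_linear_denoiser(x, z)

# Compare the fitted predictor with the closed form on a grid of noisy values.
z_grid = np.linspace(z.min(), z.max(), 5)
fitted = slope * z_grid + intercept
exact = analytic_posterior_mean(z_grid, alpha, sigma, mean, std)
max_gap = float(np.max(np.abs(fitted - exact)))

print(f"fitted slope: {slope:.4f}")
print(f"largest gap between fitted and exact E[x | z]: {max_gap:.4f}")
```

In this linear-Gaussian setting the first moment is available in closed form, which makes the comparison easy to check. For image data the expectation would instead be approximated by a trained network.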
May-29-2025, 06:23:57 GMT