PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher

Sony AI CA, USA

May-28-2025, 18:23:58 GMT–Neural Information Processing Systems

Diffusion models perform remarkably well at generating high-dimensional content but are computationally intensive, especially during training. We propose Progressive Growing of Diffusion Autoencoder (PaGoDA), a novel pipeline that reduces training costs through three stages: training a diffusion model on downsampled data, distilling the pretrained diffusion model, and progressive super-resolution. With the proposed pipeline, PaGoDA achieves a 64× reduction in the cost of training its diffusion model by using 8× downsampled data. At inference, with a single step, it achieves state-of-the-art performance on ImageNet across all resolutions from 64×64 to 512×512, as well as on text-to-image generation. PaGoDA's pipeline can be applied directly in the latent space, adding compression alongside the pre-trained autoencoder in Latent Diffusion Models (e.g., Stable Diffusion). The code is available at https://github.com/sony/pagoda.
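The two numbers in the abstract are consistent with simple pixel-count arithmetic: downsampling each image side by 8 leaves 1/64 of the pixels, and 512/8 = 64 matches the lowest reported resolution. The sketch below only illustrates that arithmetic. It assumes that training cost scales roughly with pixel count and that resolution doubles at each super-resolution stage; neither assumption is stated in the abstract, and the function names are hypothetical rather than taken from the PaGoDA code.

```python
def pixel_reduction(side_factor: int) -> int:
    """Factor by which the pixel count shrinks when each image side
    is downsampled by `side_factor` (2D images assumed)."""
    return side_factor * side_factor


def doubling_schedule(base: int, target: int) -> list[int]:
    """Side lengths from `base` up to `target`, doubling at each stage.

    The doubling rule is an assumption made for illustration only.
    """
    if base <= 0 or target < base:
        raise ValueError("need 0 < base <= target")
    sizes = [base]
    while sizes[-1] < target:
        sizes.append(sizes[-1] * 2)
    if sizes[-1] != target:
        raise ValueError("target must be base times a power of two")
    return sizes


if __name__ == "__main__":
    # 8x per-side downsampling gives a 64x smaller pixel count.
    print(pixel_reduction(8))
    # Resolutions a generator would pass through when growing from 64 to 512.
    print(doubling_schedule(64, 512))
```

Under these assumptions, the diffusion teacher is trained only at the base resolution, and the higher resolutions are reached by the later stages of the pipeline.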

artificial intelligence, machine learning, pagoda

Country:
- Europe > Switzerland
  - Zürich > Zürich (0.14)
- North America > United States (0.50)

Genre:
- Research Report > Experimental Study (0.92)

Industry:
- Semiconductors & Electronics (0.61)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)