Soft Mixture Denoising: Beyond the Expressive Bottleneck of Diffusion Models
Li, Yangming, van Breugel, Boris, van der Schaar, Mihaela
Because diffusion models have shown impressive performance in a number of tasks, such as image synthesis, there is a trend in recent works to prove (under certain assumptions) that these models have strong approximation capabilities. In this paper, we show that current diffusion models actually have an expressive bottleneck in backward denoising, and that an assumption made by existing theoretical guarantees is too strong. Based on this finding, we prove that diffusion models have unbounded errors in both local and global denoising. In light of our theoretical studies, we introduce soft mixture denoising (SMD), an expressive and efficient model for backward denoising. SMD not only allows diffusion models to approximate any Gaussian mixture distribution in theory, but is also simple and efficient to implement. Our experiments on multiple image datasets show that SMD significantly improves different types of diffusion models (e.g., DDPM), especially when the number of backward iterations is small.

Diffusion models (DMs) (Sohl-Dickstein et al., 2015) have become highly popular generative models for their impressive performance in many research domains, including high-resolution image synthesis (Dhariwal & Nichol, 2021), natural language generation (Li et al., 2022), speech processing (Kong et al., 2021), and medical image analysis (Pinaya et al., 2022). To explain the effectiveness of diffusion models, recent work (Lee et al., 2022a;b; Chen et al., 2023) provided theoretical guarantees (under certain assumptions) to show that diffusion models can approximate a rich family of data distributions with arbitrarily small errors.
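To make the contrast concrete, the sketch below illustrates in PyTorch the difference between the standard single-Gaussian reverse step used by DDPM-type models and a mixture-of-Gaussians reverse step in the spirit of soft mixture denoising. This is a minimal sketch under assumed design choices; the class names (`GaussianDenoiser`, `MixtureDenoiser`), network sizes, and sampling details are illustrative, not the paper's actual implementation.

```python
# Minimal sketch: single-Gaussian vs. mixture-of-Gaussians reverse denoising step.
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


class GaussianDenoiser(nn.Module):
    """DDPM-style reverse step: p(x_{t-1} | x_t) is a single Gaussian."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def sample(self, x_t: torch.Tensor, t: torch.Tensor, sigma: float) -> torch.Tensor:
        mu = self.net(torch.cat([x_t, t], dim=-1))   # predicted mean
        return mu + sigma * torch.randn_like(mu)      # one Gaussian draw


class MixtureDenoiser(nn.Module):
    """Soft-mixture reverse step: p(x_{t-1} | x_t) is a K-component Gaussian mixture."""

    def __init__(self, dim: int, n_components: int = 4, hidden: int = 128):
        super().__init__()
        self.dim, self.K = dim, n_components
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, n_components * dim + n_components),  # K means + K logits
        )

    def sample(self, x_t: torch.Tensor, t: torch.Tensor, sigma: float) -> torch.Tensor:
        out = self.net(torch.cat([x_t, t], dim=-1))
        means = out[..., : self.K * self.dim].view(-1, self.K, self.dim)
        logits = out[..., self.K * self.dim:]                    # soft mixture weights
        k = torch.distributions.Categorical(logits=logits).sample()  # pick a component
        mu = means[torch.arange(means.size(0)), k]
        return mu + sigma * torch.randn_like(mu)


if __name__ == "__main__":
    x_t = torch.randn(8, 2)            # batch of noisy samples
    t = torch.full((8, 1), 0.5)        # normalized timestep
    print(GaussianDenoiser(2).sample(x_t, t, sigma=0.1).shape)
    print(MixtureDenoiser(2).sample(x_t, t, sigma=0.1).shape)
```

The point of the mixture parameterization is that a single Gaussian per reverse step can only represent unimodal conditionals, whereas a soft mixture can capture multimodal ones, which is the expressive gap the paper's theoretical analysis targets.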
arXiv.org Artificial Intelligence
Jan-18-2024
- Genre:
- Research Report (0.64)
- Industry:
- Health & Medicine > Diagnostic Medicine > Imaging (0.48)
- Technology: