Constant Rate Schedule: Constant-Rate Distributional Change for Efficient Training and Sampling in Diffusion Models

Shuntaro Okada, Kenji Doi, Ryota Yoshihashi, Hirokatsu Kataoka, Tomohiro Tanaka

arXiv.org Artificial Intelligence 

We propose a noise schedule that ensures a constant rate of change in the probability distribution of the diffused data throughout the diffusion process. To obtain this noise schedule, we measure the rate of change in the probability distribution of the forward process and use it to determine the noise schedule before training the diffusion model. The functional form of the noise schedule is thus determined automatically and tailored to each dataset and type of diffusion model. We evaluate the effectiveness of our noise schedule on unconditional and class-conditional image-generation tasks using the LSUN (bedroom/church/cat/horse), ImageNet, and FFHQ datasets. Through extensive experiments, we confirm that our noise schedule broadly improves the performance of diffusion models regardless of the dataset, sampler, number of function evaluations, or type of diffusion model.

Image generation is one of the most challenging tasks in computer vision, and a variety of deep generative models have been proposed for it. Among them, generative adversarial networks (GANs) (Goodfellow et al., 2014) have long been the leading models for high-quality image generation. Deep generative models have also achieved success across a wide range of fields beyond image generation, such as audio synthesis (van den Oord et al., 2016; Kong et al., 2021) and 3D point-cloud generation (Yang et al., 2019). The performance of generative models is commonly assessed along three axes: sampling speed, sample quality, and mode coverage (Xiao et al., 2022). Despite extensive research, satisfying all three requirements simultaneously remains challenging.
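To make the recipe in the abstract concrete, the following is a minimal sketch of how a constant-rate schedule could be constructed once the rate of distributional change has been measured on a dense grid of noise levels. The function name `constant_rate_schedule`, the inputs `sigmas_dense` and `change_rates`, and the interpolation-based inversion are illustrative assumptions, not the paper's implementation; in particular, the paper's actual measure of distributional change may differ from whatever produces `change_rates` here.

```python
import numpy as np

def constant_rate_schedule(sigmas_dense, change_rates, num_steps):
    """Hypothetical sketch: pick `num_steps` noise levels so that the
    measured change of the diffused-data distribution is (approximately)
    equal between consecutive levels.

    sigmas_dense : increasing 1-D array of noise levels (dense grid).
    change_rates : measured distributional change between adjacent levels
                   of `sigmas_dense` (length len(sigmas_dense) - 1,
                   non-negative); how this is measured is the paper's
                   contribution and is assumed as given here.
    num_steps    : number of noise levels in the returned schedule.
    """
    # Cumulative distributional change accumulated along the forward process.
    cumulative = np.concatenate([[0.0], np.cumsum(change_rates)])
    # Equally spaced targets in cumulative-change space yield a constant
    # per-step change ...
    targets = np.linspace(0.0, cumulative[-1], num_steps)
    # ... and inverse interpolation maps those targets back to noise levels.
    return np.interp(targets, cumulative, sigmas_dense)

# Toy usage: a peaked change rate concentrates steps where the distribution
# changes fastest (the rate values below are made up for illustration).
sigmas = np.linspace(0.0, 80.0, 1001)
rates = np.exp(-0.5 * ((sigmas[:-1] - 5.0) / 2.0) ** 2)
schedule = constant_rate_schedule(sigmas, rates, num_steps=20)
```

The design point this sketch captures is the reparametrization itself: the noise level is rescaled so that equal steps correspond to equal amounts of measured distributional change, which is what "constant rate" means in the title.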