Multi-scale Generative Modeling for Fast Sampling

Xiao, Xiongye, Li, Shixuan, Huang, Luzhe, Liu, Gengshuo, Nguyen, Trung-Kien, Huang, Yi, Chang, Di, Kochenderfer, Mykel J., Bogdan, Paul

arXiv.org Artificial Intelligence 

While working within the spatial domain can pose problems associated with ill-conditioned scores caused by power-law decay, recent advances in diffusion-based generative models have shown that transitioning to the wavelet domain offers a promising alternative. However, the wavelet domain poses its own challenges, especially the sparse representation of high-frequency coefficients, which deviates significantly from the Gaussian assumptions of the diffusion process. To this end, we propose a multi-scale generative modeling approach in the wavelet domain that employs distinct strategies for handling low- and high-frequency bands. In the wavelet domain, we apply score-based generative modeling with well-conditioned scores to the low-frequency bands, while using multi-scale generative adversarial learning for the high-frequency bands. As supported by theoretical analysis and experimental results, our model significantly improves performance and reduces the number of trainable parameters, sampling steps, and time.
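The sparsity of high-frequency wavelet coefficients mentioned above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hand-written one-level orthonormal Haar transform in NumPy and a toy step-edge image, and the names `haar_dwt2` and `excess_kurtosis` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def haar_dwt2(x):
    """One level of an orthonormal 2-D Haar transform (array sides must be even)."""
    s = np.sqrt(2.0)
    # Filter along rows: pairwise sums (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter along columns to obtain the four sub-bands.
    ll = (lo[0::2, :] + lo[1::2, :]) / s  # low-frequency approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / s  # detail band
    hl = (hi[0::2, :] + hi[1::2, :]) / s  # detail band
    hh = (hi[0::2, :] - hi[1::2, :]) / s  # detail band
    return ll, (lh, hl, hh)

def excess_kurtosis(a):
    """Excess kurtosis; 0 for a Gaussian, large for sparse, heavy-tailed data."""
    a = a.ravel() - a.mean()
    return float((a ** 4).mean() / (a ** 2).mean() ** 2 - 3.0)

# Toy image (an assumption for illustration): a single vertical step edge.
x = np.zeros((64, 64))
x[:, 33:] = 1.0

ll, (lh, hl, hh) = haar_dwt2(x)

# Fraction of (near-)zero coefficients in one high-frequency band.
sparsity = float(np.mean(np.abs(hl) < 1e-12))
# Heavy tails relative to a Gaussian, whose excess kurtosis is 0.
kurt = excess_kurtosis(hl)
# Orthonormality check: the transform preserves total energy.
energy_in = float((x ** 2).sum())
energy_out = float(sum((b ** 2).sum() for b in (ll, lh, hl, hh)))
```

In this toy case almost all coefficients in the `hl` band are zero and its excess kurtosis is far above the Gaussian value of 0, while the `ll` band remains a dense, downsampled version of the image. That contrast is the kind of mismatch with Gaussian assumptions that motivates treating the low- and high-frequency bands with different generative strategies.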