What Exactly Does Guidance Do in Masked Discrete Diffusion Models
He Ye, Kevin Rojas, Molei Tao
Diffusion models have become an influential tool for generative modeling, offering a flexible framework that performs well across a range of data types including images, audio, and text (Dhariwal and Nichol, 2021; Kong et al., 2021; Li et al., 2022; Ho et al., 2022). Originally formulated in continuous state spaces (Ho et al., 2020; Song et al., 2021), these models simulate a forward noising process, typically modeled by a stochastic differential equation, and learn a reverse process that denoises and reconstructs the original data. More recently, the diffusion framework has been extended to discrete state spaces (Campbell et al., 2022; Lou et al., 2023), where the forward process is defined via a continuous-time Markov chain over a finite state space. This has enabled generative modeling for discrete domains such as language modeling, molecule generation, and protein design (Lou et al., 2023; Nie et al.; Huang et al., 2023; Gruver et al., 2023). A key innovation that has enhanced the performance and flexibility of diffusion models is guidance, which introduces an auxiliary parameter to steer the reverse process toward desired outputs. In the continuous setting, classifier guidance (Dhariwal and Nichol, 2021) and classifier-free guidance (Ho and Salimans, 2021; Nichol et al., 2022) are widely used for conditional generation based on class labels or text prompts, significantly improving sample quality and alignment with conditioning signals. These techniques have been critical to the success of models such as GLIDE (Nichol et al., 2022) and Imagen (Saharia et al., 2022). Theoretical analyses of guided diffusion models in continuous state spaces have examined how guidance modifies the reverse dynamics, mostly in simple settings such as low-dimensional distributions and Gaussian mixture models (Bradley and Nakkiran, 2024; Wu et al., 2024; Chidambaram et al., 2024).
Classifier-free guidance (CFG) has also recently been introduced to discrete diffusion models, with applications such as text generation and controlled molecule design (Huang et al., 2023; Nisonoff et al., 2024).
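To make the role of the guidance parameter concrete, the following is a minimal sketch of one common form of CFG applied to a single categorical distribution, for example the token distribution at one masked position. It assumes the geometric weighting p_w(x) ∝ p(x | c)^(1+w) · p(x)^(-w); this parametrization, the function name `guided_distribution`, and the toy probabilities are illustrative assumptions, not necessarily the formulation analyzed in the paper.

```python
import math

def guided_distribution(cond_probs, uncond_probs, w):
    """Hypothetical helper: geometric CFG mix over one categorical variable.

    Assumed form: p_w(x) proportional to p(x|c)^(1+w) * p(x)^(-w).
    All input probabilities must be strictly positive.
    """
    # Work in log space, then normalize with a max-shift for stability.
    log_scores = [
        (1.0 + w) * math.log(pc) - w * math.log(pu)
        for pc, pu in zip(cond_probs, uncond_probs)
    ]
    m = max(log_scores)
    weights = [math.exp(s - m) for s in log_scores]
    total = sum(weights)
    return [v / total for v in weights]

# Toy conditional and unconditional distributions over three tokens.
cond = [0.6, 0.3, 0.1]
uncond = [0.4, 0.4, 0.2]

plain = guided_distribution(cond, uncond, w=0.0)  # w = 0 recovers the conditional
sharp = guided_distribution(cond, uncond, w=2.0)  # w > 0 favors tokens where p(x|c)/p(x) is large
```

Under this assumed form, w = 0 leaves the conditional distribution unchanged, while larger w shifts mass toward outcomes that the condition makes relatively more likely.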
Jun-13-2025