Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling

Dang, Meihua, Han, Jiaqi, Xu, Minkai, Xu, Kai, Srivastava, Akash, Ermon, Stefano

Oct-7-2025–arXiv.org Artificial Intelligence

Discrete diffusion models have recently emerged as strong alternatives to autoregressive language models, matching their performance through large-scale training. However, inference-time control remains relatively underexplored. In this work, we study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity under reward optimization. PG-DLM constructs a Markov chain over full denoising trajectories and applies a conditional sequential Monte Carlo kernel to resample them. We derive theoretical guarantees for convergence, including asymptotic consistency and variance bounds. Within this framework, we further analyze trade-offs across four key axes for inference-time scaling under fixed budgets: iterations, samples, denoising steps, and reward estimation. Our analysis shows scaling iterations achieves the best reward-perplexity trade-off. Empirically, PG-DLM consistently outperforms prior methods using MDLM and LLaDA-8B as base models across a wide range of compute budgets for reward-guided generation tasks including toxicity and sentiment control as well as linguistic acceptability.

diffusion model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

Oct-7-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.28)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.70)
  - Machine Learning > Learning Graphical Models
    - Directed Networks > Bayesian Learning (0.60)