Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct

Zheng, Haoyang; Liu, Xinyang; Kong, Cindy Xiangrui; Jiang, Nan; Hu, Zheyuan; Luo, Weijian; Deng, Wei; Lin, Guang

Oct-2-2025, arXiv.org Artificial Intelligence

Fast, high-quality language generation is a central goal of current AI research. In this work, we introduce Discrete Diffusion Divergence Instruct (DiDi-Instruct), a training-based method that initializes from a pre-trained (masked) discrete diffusion language model (dLLM) and distills it into a few-step student for fast generation. The resulting DiDi-Instruct model matches or exceeds the performance of its dLLM teacher and the GPT-2 baseline while enabling up to 64$\times$ acceleration. The theoretical foundation of DiDi-Instruct is a novel framework based on integral KL-divergence minimization, which yields a practical training algorithm. We further introduce grouped reward normalization, intermediate-state matching, and a reward-guided ancestral sampler, which significantly improve training stability, model coverage, and inference quality. On OpenWebText, DiDi-Instruct achieves perplexities ranging from 62.2 at 8 function evaluations (NFEs) to 18.4 at 128 NFEs, outperforming prior accelerated dLLMs and the GPT-2 baseline. These gains come with a negligible entropy loss (around $1\%$), and DiDi-Instruct reduces additional training wall-clock time by more than $20\times$ compared to competing dLLM distillation methods. We further validate the robustness and effectiveness of DiDi-Instruct through extensive ablation studies, model scaling, and the generation of discrete protein sequences. In conclusion, DiDi-Instruct is an efficient and effective distillation method for fast language generation. We will release both code and models at github.com/haoyangzheng-ai/didi-instruct.
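For readers unfamiliar with the perplexity figures quoted above, the following is a minimal sketch of the standard definition: the exponential of the mean negative log-likelihood per token. It is not the paper's evaluation code. The function name `perplexity` and the sample log-probabilities are hypothetical, and the abstract does not say which scoring model or tokenization the authors used.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities.

    `token_log_probs` is a hypothetical input: one log-probability per generated
    token, as assigned by whatever scoring model is used for evaluation.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustrative values only, not taken from the paper: a scorer that assigns
# probability 1/4 to every token yields a perplexity of about 4.
uniform_log_probs = [math.log(0.25)] * 10
print(perplexity(uniform_log_probs))
```

Lower values mean the scoring model finds the text more predictable, which is why the abstract reports perplexity alongside entropy: a low perplexity with collapsed entropy would indicate repetitive output rather than better text.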

didi-instruct, large language model, machine learning

Country:
- North America > United States (1.00)
- Europe (1.00)
- Asia
  - Middle East > Syria (1.00)
  - Russia (0.93)

Genre:
- Research Report > New Finding (1.00)
- Personal > Interview (0.92)

Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Information Technology (1.00)
- Law (1.00)
- Education (1.00)
- Energy > Power Industry (0.94)
- Media > News (0.67)
- Leisure & Entertainment > Games
  - Computer Games (0.67)
- Health & Medicine
  - Therapeutic Area (0.93)
  - Pharmaceuticals & Biotechnology (0.88)
- Government
  - Voting & Elections (1.00)
  - Military (1.00)
  - Foreign Policy (0.92)
  - Regional Government
    - North America Government > United States Government (1.00)
    - Europe Government > Russia Government (0.67)
    - Asia Government
      - Middle East Government > Syria Government (1.00)
      - Russia Government (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Generation (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
