SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Han, Xiaochuang, Kumar, Sachin, Tsvetkov, Yulia
arXiv.org Artificial Intelligence
Despite the growing success of diffusion models in continuous-valued domains (e.g., images), similar efforts for discrete domains such as text have yet to match the performance of autoregressive language models. In this work, we present SSD-LM -- a diffusion-based language model with two key design choices. First, SSD-LM is semi-autoregressive, iteratively generating blocks of text, allowing for flexible output length at decoding time while enabling local bidirectional context updates. Second, it is simplex-based, performing diffusion on the natural vocabulary space rather than a learned latent space, allowing us to incorporate classifier guidance and modular control using off-the-shelf classifiers without any adaptation. We evaluate SSD-LM on unconstrained text generation benchmarks, and show that it matches or outperforms strong autoregressive GPT-2 models across standard quality and diversity metrics, while vastly outperforming diffusion-based baselines. On controlled text generation, SSD-LM also outperforms competitive baselines, with an extra advantage in modularity.
Jun-26-2023
- Country:
- Asia
- Europe
- Denmark > Capital Region
- Copenhagen (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Russia > Central Federal District
- Moscow Oblast > Moscow (0.04)
- Ukraine (0.04)
- Western Europe (0.04)
- North America
- Dominican Republic (0.04)
- United States
- California > San Diego County
- San Diego (0.04)
- New York > New York County
- New York City (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- South America > Ecuador (0.04)
- Genre:
- Research Report (0.64)
- Industry:
- Government > Military (0.67)
- Information Technology > Security & Privacy (0.46)
- Technology: