diffusion-lm
Towards Latent Diffusion Suitable For Text
Midavaine, Nesta, Naesseth, Christian A., Bartosh, Grigory
Language diffusion models aim to improve sampling speed and coherence over autoregressive LLMs. We introduce Neural Flow Diffusion Models for language generation, an extension of NFDM that enables the straightforward application of continuous diffusion models to discrete state spaces. NFDM learns a multivariate forward process from the data, ensuring that the forward process and generative trajectory are a good fit for language modeling. Our model substantially reduces the likelihood gap with autoregressive models of the same size, while achieving sample quality comparable to that of previous latent diffusion models.
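The abstract describes a learned, data-dependent forward process over continuous latents that stand in for discrete tokens. Below is a minimal PyTorch sketch of that general setup, assuming tokens are first embedded into continuous vectors and the forward distribution's mean and variance are predicted by a small network; the module names, toy dimensions, and the reparameterized sampling shown here are illustrative assumptions, not the paper's actual objective or API.

```python
# Sketch: continuous diffusion over token embeddings with a learned forward process.
# LearnedForwardProcess and the toy sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class LearnedForwardProcess(nn.Module):
    """Parameterizes a data-dependent noising distribution q(z_t | x, t)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Linear(dim + 1, dim * 2)  # predicts per-dimension mean and log-variance

    def forward(self, x_emb, t):
        # x_emb: (batch, seq, dim) token embeddings; t: (batch, 1, 1) diffusion time in [0, 1]
        h = torch.cat([x_emb, t.expand(*x_emb.shape[:2], 1)], dim=-1)
        mean, log_var = self.net(h).chunk(2, dim=-1)
        eps = torch.randn_like(mean)
        z_t = mean + (0.5 * log_var).exp() * eps  # reparameterized sample of the noised latent
        return z_t, mean, log_var

# Usage: embed discrete tokens and draw a noised latent; a reverse model would be
# trained to recover the embeddings from z_t (the real NFDM objective is more involved).
vocab_size, dim = 1000, 64
embed = nn.Embedding(vocab_size, dim)
forward_process = LearnedForwardProcess(dim)
tokens = torch.randint(0, vocab_size, (2, 16))
t = torch.rand(2, 1, 1)
z_t, mean, log_var = forward_process(embed(tokens), t)
```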
Diffusion-LM Improves Controllable Text Generation
Controlling the behavior of language models (LMs) without re-training is a major open problem in natural language generation. While recent works have demonstrated successes on controlling simple sentence attributes (e.g., sentiment), there has been little progress on complex, fine-grained controls (e.g., syntactic structure). To address this challenge, we develop a new non-autoregressive language model based on continuous diffusions that we call Diffusion-LM. Building on diffusion models in continuous domains, Diffusion-LM iteratively denoises a sequence of Gaussian vectors into word vectors, yielding a sequence of intermediate latent variables. The continuous, hierarchical nature of these intermediate variables enables a simple gradient-based algorithm to perform complex, controllable generation tasks. We demonstrate successful control of Diffusion-LM for six challenging fine-grained control tasks, significantly outperforming prior work.
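The abstract's key mechanism is that control is applied by taking gradient steps on the continuous intermediate latents. A minimal sketch of that plug-and-play idea follows; `denoiser` and `classifier` are placeholder callables (assumptions), and the step sizes are arbitrary, so this illustrates the guidance pattern rather than the released Diffusion-LM code.

```python
# Sketch: one reverse-diffusion step followed by gradient-based guidance on the latent.
# denoiser(z_t, t) -> less-noisy latent; classifier(z) -> attribute logits. Both are assumed.
import torch

def controlled_denoise_step(z_t, t, denoiser, classifier, target, step_size=0.1, n_grad_steps=3):
    with torch.no_grad():
        z_prev = denoiser(z_t, t)                      # proposed less-noisy latent
    z_prev = z_prev.detach().requires_grad_(True)
    for _ in range(n_grad_steps):
        # push the latent toward the target attribute under the classifier
        log_p = classifier(z_prev).log_softmax(dim=-1)[..., target].sum()
        grad = torch.autograd.grad(log_p, z_prev)[0]
        z_prev = (z_prev + step_size * grad).detach().requires_grad_(True)
    return z_prev.detach()
```

Applying such a guided update at each denoising iteration is the kind of gradient-based control the abstract refers to: the latents are steered toward the desired attribute without re-training the LM.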
Generative Design of inorganic compounds using deep diffusion language models
Dong, Rongzhi, Fu, Nihang, Siriwardane, Edirisuriya M. D., Hu, Jianjun
Discovering novel synthesizable and stable materials is of fundamental importance to our society, yet chemical innovation is nontrivial. The material composition and structure must satisfy many stringent constraints such as charge neutrality, balanced electronegativity, synthesizability, geometric symmetry, and mechanical stability. Historically, new material discovery has relied on expert heuristics and is usually based on tinkering with existing materials. Several structure generation studies [1, 2] have used brute-force element substitution to generate new structures based on known prototypes. However, this permutation-based approach is limited: it cannot generate new formula prototypes and can only employ known formulas as templates, producing novel compositions solely through the substitution of elements. With the development of crystal structure prediction algorithms such as CSPML [3], TCSP [4], and ParetoCSP [5], the generation of chemically stable compositions has emerged as an increasingly critical challenge. Stable compositions play a pivotal role in mitigating the computational demands associated with subsequent stages of analysis.
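The permutation-based baseline described above (known formula templates plus element substitution, filtered by constraints such as charge neutrality) can be made concrete with a short sketch. The tiny oxidation-state table and the candidate element lists are illustrative assumptions; a real screen would use a full chemistry toolkit and many more constraints.

```python
# Sketch: template-based element substitution with a charge-neutrality filter.
# OXIDATION_STATES is a deliberately tiny, assumed table for illustration only.
from itertools import product

OXIDATION_STATES = {
    "Na": [1], "K": [1], "Li": [1],
    "Cl": [-1], "Br": [-1], "F": [-1],
}

def is_charge_neutral(formula):
    """formula: dict element -> count; True if some oxidation-state assignment sums to zero."""
    elements, counts = zip(*formula.items())
    for states in product(*(OXIDATION_STATES[e] for e in elements)):
        if sum(s * c for s, c in zip(states, counts)) == 0:
            return True
    return False

def substitute(template, site_candidates):
    """Enumerate substitutions of a known prototype, keeping charge-neutral candidates."""
    sites = list(template)
    for combo in product(*(site_candidates[s] for s in sites)):
        candidate = {element: template[site] for site, element in zip(sites, combo)}
        if len(candidate) == len(sites) and is_charge_neutral(candidate):
            yield candidate

# Example: rock-salt prototype AB, substituting A in {Na, K, Li} and B in {Cl, Br, F}
for c in substitute({"Na": 1, "Cl": 1}, {"Na": ["Na", "K", "Li"], "Cl": ["Cl", "Br", "F"]}):
    print(c)
```

As the abstract notes, this style of generation can only reshuffle elements within known formula prototypes, which is exactly the limitation the diffusion language model is meant to lift.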
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Han, Xiaochuang, Kumar, Sachin, Tsvetkov, Yulia
Despite the growing success of diffusion models in continuous-valued domains (e.g., images), similar efforts for discrete domains such as text have yet to match the performance of autoregressive language models. In this work, we present SSD-LM -- a diffusion-based language model with two key design choices. First, SSD-LM is semi-autoregressive, iteratively generating blocks of text, allowing for flexible output length at decoding time while enabling local bidirectional context updates. Second, it is simplex-based, performing diffusion on the natural vocabulary space rather than a learned latent space, allowing us to incorporate classifier guidance and modular control using off-the-shelf classifiers without any adaptation. We evaluate SSD-LM on unconstrained text generation benchmarks, and show that it matches or outperforms strong autoregressive GPT-2 models across standard quality and diversity metrics, while vastly outperforming diffusion-based baselines. On controlled text generation, SSD-LM also outperforms competitive baselines, with an extra advantage in modularity.
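The two design choices described above, simplex-based representation and semi-autoregressive block decoding, can be sketched as follows. The `denoiser` callable, the ±k logit encoding, and all sizes are assumptions standing in for the real SSD-LM model and hyperparameters.

```python
# Sketch: represent tokens as almost-one-hot logits over the vocabulary and generate
# text block by block, refining each block iteratively from noise. denoiser() is assumed.
import torch

def to_simplex_logits(token_ids, vocab_size, k=5.0):
    """Map discrete token ids to +k / -k logits over the vocabulary."""
    logits = torch.full((*token_ids.shape, vocab_size), -k)
    logits.scatter_(-1, token_ids.unsqueeze(-1), k)
    return logits

@torch.no_grad()
def generate(denoiser, prefix_ids, vocab_size, block_len=25, n_blocks=4, n_steps=10):
    context = prefix_ids
    for _ in range(n_blocks):                                          # semi-autoregressive loop
        block = torch.randn(context.shape[0], block_len, vocab_size)   # start the block from noise
        for _ in range(n_steps):                                       # iterative refinement
            block = denoiser(to_simplex_logits(context, vocab_size), block)
        context = torch.cat([context, block.argmax(dim=-1)], dim=1)    # commit the decoded block
    return context
```

Because the diffusion runs directly over vocabulary logits rather than a learned latent space, an off-the-shelf classifier can score the intermediate logits during refinement, which is the modular-control property the abstract highlights.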