Latent Diffusion for Language Generation
Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Q. Weinberger
arXiv.org Artificial Intelligence
Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models.
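The abstract describes learning a continuous diffusion model in the latent space of a language autoencoder. As a rough illustration of what "diffusion in latent space" involves, the following minimal sketch applies a variance-preserving forward noising step to stand-in latents. Everything here is an assumption made for illustration: the cosine-style schedule `alpha_bar`, the helper `noise_latents`, and the latent tensor sizes are hypothetical and are not taken from the paper, and random vectors stand in for real encoder outputs.

```python
import numpy as np

def alpha_bar(t):
    """Signal level in [0, 1] at time t in [0, 1]; an assumed cosine-style schedule."""
    return np.cos(0.5 * np.pi * t) ** 2

def noise_latents(z0, t, rng):
    """Sample z_t from q(z_t | z_0) under a variance-preserving forward process."""
    a = alpha_bar(t)
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps
    return zt, eps

rng = np.random.default_rng(0)
# Stand-in for autoencoder latents: 4 sequences x 32 latent vectors x 64 dims (hypothetical sizes).
z0 = rng.standard_normal((4, 32, 64))
zt, eps = noise_latents(z0, t=0.5, rng=rng)
# A denoising network would be trained to predict eps (or z0) from (zt, t).
# At sampling time, the denoised latent would be passed to the pretrained decoder.
```

In a full system, `z0` would come from the language autoencoder's encoder rather than from a random generator, and the sampled latent would be decoded into text by the pretrained decoder, as the abstract outlines.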
Nov-7-2023
- Genre:
- Research Report (0.82)
- Technology: