Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion
Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf
arXiv.org Artificial Intelligence
Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another "language" of communication: music. Music, much like text, can convey emotions, stories, and ideas, and has its own unique structure and syntax. In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and able to handle long-term structure. Specifically, we develop Moûsai, a cascading two-stage latent diffusion model that can generate multiple minutes of high-quality stereo music at 48 kHz from textual descriptions. Moreover, our model is efficient enough to enable real-time inference on a single consumer GPU. Through experiments and property analyses, we demonstrate our model's competence across a variety of criteria in comparison with existing music generation models. Lastly, to promote open-source culture, we provide a collection of open-source libraries in the hope of facilitating future work in the field. We open-source the following: code: https://github.com/archinetai/audio-diffusion-pytorch; music samples for this paper: http://bit.ly/44ozWDH; music samples for all models: https://bit.ly/audio-diffusion.
Oct-23-2023
- Country:
  - Asia
    - India > West Bengal
      - Kharagpur (0.04)
    - Japan > Kyūshū & Okinawa
      - Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
    - South Korea (0.04)
  - Europe
    - Austria (0.04)
    - France (0.04)
    - Germany
      - Baden-Württemberg > Tübingen Region
        - Tübingen (0.04)
      - Bavaria > Upper Bavaria
        - Munich (0.04)
    - Italy > Calabria
      - Catanzaro Province > Catanzaro (0.04)
    - Switzerland > Zürich
      - Zürich (0.04)
  - North America
    - Canada
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.04)
      - Quebec > Montreal (0.04)
    - United States
      - California > Santa Clara County
        - Sunnyvale (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Maryland > Baltimore (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.04)
  - South America > Chile
- Genre:
  - Research Report (0.50)
- Industry:
  - Leisure & Entertainment (1.00)
  - Media > Music (1.00)
- Technology: