JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

Li, Peike; Chen, Boyu; Yao, Yao; Wang, Yikai; Wang, Allen; Wang, Alex

arXiv.org Artificial Intelligence 

Music generation has attracted growing interest with the advancement of deep generative models. However, generating music conditioned on textual descriptions, known as text-to-music, remains challenging due to the complexity of musical structures and high sampling rate requirements. This paper introduces JEN-1, a universal high-fidelity model for text-to-music generation. JEN-1 is a diffusion model incorporating both autoregressive and non-autoregressive training. Through in-context learning, JEN-1 performs various generation tasks, including text-guided music generation, music inpainting, and continuation. Evaluations demonstrate JEN-1's superior performance over state-of-the-art methods in text-music alignment and music quality while maintaining computational efficiency. Our demos are available at https://www.futureverse.com/research/jen/

"Music is the universal language of mankind." - Henry Wadsworth Longfellow

Music, as an artistic expression comprising harmony, melody, and rhythm, holds great cultural significance and appeal for humans. Recent years have witnessed remarkable progress in music generation with the rise of deep generative models (Liu et al., 2023; Kreuk et al., 2022; Agostinelli et al., 2023).
