Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
arXiv.org Artificial Intelligence
The SS matrices are downsampled to 5 × 5. The results indicate that, compared to MusicGen, our method produces samples that more closely resemble the Pond5 samples in terms of long-term temporal consistency and the diversity of recurring sections.
The new wave of generative models has been explored in the literature to generate music. Jukebox [1] is based on Hierarchical VQ-VAEs [2] to generate multiple minutes of music. Jukebox is one of the earliest purely learning-based models that could generate longer than one minute of music with some degree of structural coherence. Notably, the authors mention that the generated music at a small scale of multiple seconds is coherent, and at a larger scale, beyond one minute, it lacks musical form.
[...] learn musical structures and forms at all scales. However, none of the models in the literature has demonstrated musical [...]
Oct-5-2024
- Country:
- North America > United States > California > San Diego County > San Diego (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Leisure & Entertainment (1.00)
- Media > Music (1.00)
- Technology: