Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces

Oct-5-2024–arXiv.org Artificial Intelligence

The new wave of generative models has been explored in the literature to generate music. Jukebox [1] is based on Hierarchical VQ-VAEs [2] to generate multiple minutes of music. Jukebox is one of the earliest purely learning-based models that could generate longer than one minute of music with some degree of structural coherence. Notably, the authors mention that the generated music at a small scale of multiple seconds is coherent, and at a larger scale, beyond one minute, it lacks musical form.

[...] learn musical structures and forms at all scales. However, none of the models in the literature has demonstrated musical [...]

The SS matrices are downsampled to 5 × 5. The results indicate that, compared to MusicGen, our method produces samples that more closely resemble the Pond5 samples in terms of long-term temporal consistency and the diversity of recurring sections.
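To make the downsampling of SS matrices mentioned above concrete, here is a minimal sketch. It rests on several assumptions that the excerpt does not confirm: that "SS" stands for self-similarity, that similarity is the cosine between per-frame feature vectors, and that downsampling is done by block averaging. The function names `cosine_self_similarity` and `downsample_mean` are hypothetical and are not taken from the paper.

```python
import math


def cosine_self_similarity(features):
    """Return the N x N cosine self-similarity matrix of N feature vectors.

    Assumption: cosine similarity is used; the excerpt does not say which
    similarity measure or which audio features the authors rely on.
    """
    norms = [math.sqrt(sum(x * x for x in f)) for f in features]
    n = len(features)
    ss = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            denom = norms[i] * norms[j]
            ss[i][j] = dot / denom if denom > 0 else 0.0
    return ss


def downsample_mean(matrix, size=5):
    """Block-average a square matrix down to size x size.

    Assumption: mean pooling over contiguous blocks; the excerpt only states
    the target resolution, not the pooling method.
    """
    n = len(matrix)
    if n < size:
        raise ValueError("matrix is smaller than the target size")
    # Integer block boundaries; blocks differ by at most one row or column
    # when n is not a multiple of size.
    edges = [(k * n) // size for k in range(size + 1)]
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [
                matrix[i][j]
                for i in range(edges[r], edges[r + 1])
                for j in range(edges[c], edges[c + 1])
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Toy piece with the form A B A B A, two frames per section.
section_a, section_b = [1.0, 0.0], [0.0, 1.0]
frames = [section_a] * 2 + [section_b] * 2 + [section_a] * 2 \
    + [section_b] * 2 + [section_a] * 2
coarse = downsample_mean(cosine_self_similarity(frames), size=5)
for row in coarse:
    print(row)
```

In this toy case each cell of the coarse matrix summarizes how similar one fifth of the piece is to another fifth, so recurring sections show up as a checkerboard of high and low values. Comparing such coarse matrices across generated and reference samples is one plausible way to quantify long-term structure, though the exact metric used in the paper is not given in this excerpt.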

generative model, music, musicgen


Country:
- North America > United States > California > San Diego County > San Diego (0.04)

Genre:
- Research Report (1.00)

Industry:
- Media > Music (1.00)
- Leisure & Entertainment (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.99)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
