TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure
–arXiv.org Artificial Intelligence
Hierarchical planning is a powerful approach to model long sequences structurally. Aside from considering hierarchies in the temporal structure of music, this paper explores an even more important aspect: concept hierarchy, which involves generating music ideas, transforming them, and ultimately organizing them, across musical time and space, into a complete composition. To this end, we introduce TOMI (Transforming and Organizing Music Ideas) as a novel approach in deep music generation and develop a TOMI-based model via an instruction-tuned foundation LLM. Formally, we represent a multi-track composition process via a sparse, four-dimensional space characterized by clips (short audio or MIDI segments), sections (temporal positions), tracks (instrument layers), and transformations (elaboration methods). Our model is capable of generating multi-track electronic music with full-song structure, and we further integrate the TOMI-based model with the REAPER digital audio workstation, enabling interactive human-AI co-creation. Experimental results demonstrate that our approach produces higher-quality electronic music with stronger structural coherence compared to baselines.
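The sparse, four-dimensional space described in the abstract can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the class names `CompositionLink` and `Composition`, the method names, and the example values such as "as_is" and "loop" are all hypothetical. The one idea taken from the abstract is that a composition is a set of occupied (clip, section, track, transformation) points, and that most possible combinations stay empty.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CompositionLink:
    """One occupied point in the (clip, section, track, transformation) space.

    All field values are hypothetical identifiers chosen for illustration.
    """
    clip: str            # short audio or MIDI segment
    section: str         # temporal position in the song
    track: str           # instrument layer
    transformation: str  # elaboration method applied to the clip


@dataclass
class Composition:
    """Sparse store: only the combinations actually used are kept."""
    links: set = field(default_factory=set)

    def add(self, clip, section, track, transformation):
        """Mark one point of the four-dimensional space as occupied."""
        self.links.add(CompositionLink(clip, section, track, transformation))

    def clips_in(self, section, track):
        """Return the sorted clip names placed at a given section and track."""
        return sorted(
            link.clip
            for link in self.links
            if link.section == section and link.track == track
        )

    def density(self, n_clips, n_sections, n_tracks, n_transformations):
        """Fraction of the full four-dimensional grid that is occupied."""
        total = n_clips * n_sections * n_tracks * n_transformations
        return len(self.links) / total


# Hypothetical example: one bass idea reused in two sections, one lead idea.
song = Composition()
song.add("bass_riff", "intro", "bass", "as_is")
song.add("bass_riff", "chorus", "bass", "loop")
song.add("lead_melody", "chorus", "lead", "as_is")
```

In this toy example three of the sixteen possible points (two clips, two sections, two tracks, two transformations) are occupied, which is the sense in which such a space is sparse: the same clip can be reused across sections with different transformations without filling the whole grid.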
Jul-1-2025
- Country:
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - North America > United States
    - Rhode Island (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > San Francisco County
      - San Francisco (0.28)
  - Europe
  - Asia
- Genre:
  - Research Report > New Finding (0.66)
- Industry:
  - Media > Music (1.00)
  - Leisure & Entertainment (1.00)
- Technology: