Improving BERT for Symbolic Music Understanding Using Token Denoising and Pianoroll Prediction
arXiv.org Artificial Intelligence
We propose a pre-trained BERT-like model for symbolic music understanding that achieves competitive performance across a wide range of downstream tasks. To this end, we design two novel pre-training objectives: token correction and pianoroll prediction. First, we sample a portion of the note tokens, corrupt them with a limited amount of noise, and train the model to denoise the corrupted tokens. Second, we train the model to predict bar-level and local pianoroll-derived representations from the corrupted note tokens. We argue that these objectives guide the model toward specific musical knowledge such as pitch intervals. For evaluation, we propose a benchmark of 12 downstream tasks ranging from chord estimation to symbolic genre classification. The results confirm the effectiveness of the proposed pre-training objectives on these downstream tasks.
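The two objectives described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the noise model, the sampling ratio, or the exact pianoroll-derived targets. The function names `corrupt_notes` and `bar_pianoroll`, the parameters `ratio`, `max_shift`, and `steps_per_bar`, the choice of a small pitch shift as the "limited amount of noise", and the binary 128-by-steps pianoroll are all assumptions made for illustration.

```python
import random


def corrupt_notes(pitches, ratio=0.15, max_shift=2, rng=None):
    """Shift a sampled subset of MIDI pitches by a small nonzero offset.

    Returns (corrupted, mask), where mask[i] is True if position i was
    corrupted. The ratio and max_shift defaults are assumed values, not
    taken from the paper.
    """
    rng = rng or random.Random()
    n = len(pitches)
    k = max(1, round(n * ratio)) if n else 0
    chosen = set(rng.sample(range(n), k))
    corrupted, mask = [], []
    for i, p in enumerate(pitches):
        if i in chosen:
            # Keep only nonzero offsets that stay inside the MIDI range,
            # so a chosen note is always actually changed.
            offsets = [d for d in range(-max_shift, max_shift + 1)
                       if d != 0 and 0 <= p + d <= 127]
            corrupted.append(p + rng.choice(offsets))
            mask.append(True)
        else:
            corrupted.append(p)
            mask.append(False)
    return corrupted, mask


def bar_pianoroll(notes, steps_per_bar=16):
    """Build a binary pianoroll (128 pitches x steps_per_bar) for one bar.

    Each note is (pitch, onset_step, duration_steps). Notes that run past
    the bar line are truncated. This is one assumed form of a
    bar-level pianoroll target.
    """
    roll = [[0] * steps_per_bar for _ in range(128)]
    for pitch, onset, duration in notes:
        for t in range(onset, min(onset + duration, steps_per_bar)):
            roll[pitch][t] = 1
    return roll


if __name__ == "__main__":
    clean = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76]
    noisy, mask = corrupt_notes(clean, ratio=0.3, rng=random.Random(0))
    print(noisy, mask)
```

In a setup like this, the model would receive the corrupted pitches as input, with the clean pitches serving as the denoising target and the pianoroll built from the clean notes serving as the prediction target. Restricting the noise to small pitch shifts is one way such an objective could encourage sensitivity to pitch intervals.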
Jul-8-2025
- Country:
  - Asia (0.28)
- Genre:
  - Research Report > New Finding (0.34)
- Industry:
  - Media > Music (1.00)
  - Leisure & Entertainment (1.00)
- Technology: