Music-to-Text Synaesthesia: Generating Descriptive Text from Music Recordings

Kuang, Zhihuan, Zong, Shi, Zhang, Jianbing, Chen, Jiajun, Liu, Hongfu

May-7-2023–arXiv.org Artificial Intelligence

In this paper, we consider a novel research problem: music-to-text synaesthesia. Different from the classical music tagging problem that classifies a music recording into pre-defined categories, music-to-text synaesthesia aims to generate descriptive texts from music recordings with the same sentiment for further understanding. As existing music-related datasets do not contain the semantic descriptions on music recordings, we collect a new dataset that contains 1,955 aligned pairs of classical music recordings and text descriptions. Based on this, we build a computational model to generate sentences that can describe the content of the music recording. To tackle the highly non-discriminative classical music, we design a group topology-preservation loss, which considers more samples as a group reference and preserves the relative topology among different samples. Extensive experimental results qualitatively and quantitatively demonstrate the effectiveness of our proposed model over five heuristics or pre-trained competitive methods and their variants on our collected dataset.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

May-7-2023

arXiv.org PDF

Add feedback

Genre:
- Research Report (0.82)

Industry:
- Leisure & Entertainment (1.00)
- Media > Music (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (1.00)
  - Natural Language (1.00)
  - Vision (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found