gtsinger
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
The scarcity of high-quality and multi-task singing datasets significantly hinders the development of diverse controllable and personalized singing tasks, as existing singing datasets suffer from low quality, limited diversity of languages and singers, absence of multi-technique information and realistic music scores, and poor task suitability. To tackle these problems, we present GTSinger, a large Global, multi-Technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks, along with its benchmarks. Particularly, (1) we collect 80.59 hours of high-quality singing voices, forming the largest recorded singing dataset; (2) 20 professional singers across nine widely spoken languages offer diverse timbres and styles; (3) we provide controlled comparison and phoneme-level annotations of six commonly used singing techniques, helping technique modeling and control; (4) GTSinger offers realistic music scores, assisting real-world musical composition; (5) singing voices are accompanied by manual phoneme-to-audio alignments, global style labels, and 16.16 hours of paired speech for various singing tasks. Moreover, to facilitate the use of GTSinger, we conduct four benchmark experiments: technique-controllable singing voice synthesis, technique recognition, style transfer, and speech-to-singing conversion.
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Yu Zhang
To tackle these problems, we present GTSinger, a large Global, multi-Technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks, along with its benchmarks. Particularly, (1) we collect 80.59 hours of high-quality singing voices, forming the largest recorded singing dataset; (2) 20 professional singers across nine widely spoken languages offer diverse timbres and styles; (3) we provide controlled comparison and phoneme-level annotations of six commonly used singing techniques, helping technique modeling and control; (4) GTSinger offers realistic music scores, assisting real-world musical composition; (5) singing
SLEEPING-DISCO 9M: A large-scale pre-training dataset for generative music modeling
Ahmed, Tawsif, Radonjic, Andrej, Rabby, Gollam
We present Sleeping-DISCO 9M, a large-scale pre-training dataset for music and song. To the best of our knowledge, there is no open-source, high-quality dataset representing popular and well-known songs for generative music modeling tasks such as text-music, music-captioning, singing-voice synthesis, melody reconstruction and cross-model retrieval. Past contributions focused on isolated and constrained factors, with the core aim of creating synthetic or re-recorded music corpora (e.g. GTSinger, M4Singer); arbitrarily large-scale audio datasets (e.g. DISCO-10M and LAIONDISCO-12M) have been another focus for the community. Unfortunately, adoption of these datasets in the generative music community has been limited, as they fail to reflect real-world music and its flavour. Our dataset changes this narrative: it is constructed using actual popular music and world-renowned artists.
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Zhang, Yu, Pan, Changhao, Guo, Wenxiang, Li, Ruiqi, Zhu, Zhiyuan, Wang, Jialei, Xu, Wenhao, Lu, Jingyu, Hong, Zhiqing, Wang, Chuxin, Zhang, LiChao, He, Jinzheng, Jiang, Ziyue, Chen, Yuxin, Yang, Chen, Zhou, Jiecheng, Cheng, Xinyu, Zhao, Zhou
The scarcity of high-quality and multi-task singing datasets significantly hinders the development of diverse controllable and personalized singing tasks, as existing singing datasets suffer from low quality, limited diversity of languages and singers, absence of multi-technique information and realistic music scores, and poor task suitability. To tackle these problems, we present GTSinger, a large Global, multi-Technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks, along with its benchmarks. Particularly, (1) we collect 80.59 hours of high-quality singing voices, forming the largest recorded singing dataset; (2) 20 professional singers across nine widely spoken languages offer diverse timbres and styles; (3) we provide controlled comparison and phoneme-level annotations of six commonly used singing techniques, helping technique modeling and control; (4) GTSinger offers realistic music scores, assisting real-world musical composition; (5) singing voices are accompanied by manual phoneme-to-audio alignments, global style labels, and 16.16 hours of paired speech for various singing tasks. Moreover, to facilitate the use of GTSinger, we conduct four benchmark experiments: technique-controllable singing voice synthesis, technique recognition, style transfer, and speech-to-singing conversion. The corpus and demos can be found at http://gtsinger.github.io.