Sing Along! Japanese Researchers Develop Multispeaker Corpus for Singing Voice Synthesis
Machine learning algorithms excel at generating realistic photos, videos, and even voices. Last year, researchers at AI startup Dessa created a convincing fake audio clip of popular American podcaster Joe Rogan's voice. In an Instagram post, Rogan responded to the highly realistic spoof: "At this point I've long ago left enough content out there that they could basically have me saying anything they want…" Although few-shot training may be reducing the amount of audio such a spoof requires, Rogan was not wrong about the large voice library he has generated. Generally speaking, in ML more training data yields better results, and voice synthesis is no exception. Current machine learning techniques enable researchers to synthesize singing voices at similarly high quality, but existing singing-voice datasets typically include only a single singer.