HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis
Sang-Hoon Lee, Seung-Bin Kim, Ji-Hyun Lee
Neural Information Processing Systems
HierSpeech-U can adapt to a novel speaker by utilizing self-supervised speech representations without text transcripts.