HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis

Sang-Hoon Lee 1, Seung-Bin Kim 2, Ji-Hyun Lee 2

Neural Information Processing Systems 

HierSpeech-U can adapt to a novel speaker by utilizing self-supervised speech representations without text transcripts.