P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting Sungwon Kim 1,2, Kevin J Shih

Neural Information Processing Systems 

Our work proposes P-Flow, a fast and data-efficient zero-shot TTS model that uses speech prompts for speaker adaptation. P-Flow comprises a speech-prompted text encoder for speaker adaptation and a flow matching generative decoder for high-quality and fast speech synthesis.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found