Neural text-to-speech (TTS) models can generate high-quality, human-like speech. However, most TTS models can only be trained on transcribed data from the desired speaker, which means that long-form untranscribed data, such as podcasts, cannot be used to train existing models. A recent paper on arXiv proposes an unconditional diffusion-based generative model that is trained on untranscribed data and leverages a phoneme classifier to guide text-to-speech synthesis.
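The core idea, classifier guidance, can be sketched in a few lines: an unconditional diffusion model supplies a score (denoising direction), and the gradient of a classifier's log-probability steers sampling toward the desired condition. The sketch below is a toy illustration, not the paper's implementation: `uncond_score` and `classifier_grad` are stand-in functions (assumptions) replacing the trained diffusion model and phoneme classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def uncond_score(x, t):
    # Stand-in for the unconditional diffusion model's score
    # (here: the score of a standard Gaussian, -x). Assumption.
    return -x

def classifier_grad(x, t, target):
    # Stand-in for the gradient of log p(target | x_t) from a
    # phoneme classifier; here it simply pulls x toward a target
    # vector. Assumption for illustration only.
    return target - x

def guided_reverse_step(x, t, target, step=0.1, scale=1.0):
    # Classifier guidance: combine the unconditional score with the
    # scaled classifier gradient, then take one Langevin-style step.
    score = uncond_score(x, t) + scale * classifier_grad(x, t, target)
    noise = rng.normal(size=x.shape)
    # No noise is added at the final step (t == 0).
    return x + step * score + np.sqrt(2 * step) * noise * (t > 0)

# Run the guided reverse process from pure noise.
x = rng.normal(size=4)
target = np.ones(4)
for t in range(50, -1, -1):
    x = guided_reverse_step(x, t, target)
```

Because the guidance term only requires gradients through the classifier, the diffusion model itself never needs transcribed data; this is what lets the paper train on untranscribed speech alone.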
Nov-27-2021, 10:05:57 GMT