Baidu's Deep Voice can quickly synthesize realistic human speech

Mar-9-2017, 08:00:06 GMT–Engadget

Google's WaveNet can also synthesize realistic human speech, but it's quite computationally demanding and hard to use for real-world applications at this point. Baidu says it solved WaveNet's problem by using deep-learning techniques to convert text to phenomes, the smallest unit of speech. It then turns those phonemes into sounds using its speech synthesis network. The system converts the word "hello," for instance, into "(silence HH), (HH, EH), (EH, L), (L, OW), (OW, silence)" before the speech network pronounces it. Both steps rely on deep learning and don't need human input.

artificial intelligence, machine learning, synthesize realistic human speech, (4 more...)

Engadget

Mar-9-2017, 08:00:06 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found