Deep Voice: Real-Time Neural Text-to-Speech for Production - Baidu Research

Mar-1-2017, 07:45:05 GMT–#artificialintelligence

Baidu Research presents Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. The biggest obstacle to building such a system thus far has been the speed of audio synthesis – previous approaches have taken minutes or hours to generate only a few seconds of speech. We solve this challenge and show that we can do audio synthesis in real-time, which amounts to an up to 400X speedup over previous WaveNet inference implementations. Synthesizing artificial human speech from text, commonly known as text-to-speech (TTS), is an essential component in many applications such as speech-enabled devices, navigation systems, and accessibility for the visually-impaired. Fundamentally, it allows human-technology interaction without requiring visual interfaces.

artificial intelligence, deep voice, machine learning, (14 more...)

#artificialintelligence

Mar-1-2017, 07:45:05 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Synthesis (1.00)
  - Vision > Optical Character Recognition (0.91)
  - Machine Learning > Neural Networks
    - Deep Learning (0.76)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found