Text-to-Speech: One Small Step by Mankind to Create Lifelike Robots

Mar-27-2021, 17:45:14 GMT–#artificialintelligence

Note: For those of you who prefer watching videos, please feel free to play the above video on the same content. While speech synthesis has come a long way since Kratzenstein's vowel organ that could produce the five vowel sounds, it is a whole'nother level of challenge to transform text to natural-sounding speech. Recent developments in deep learning have provided us a new approach to the challenge and in this article, we shall briefly introduce a mainstream text-to-speech method before the deep learning era, then explore models like WaveNet that Google's text-to-speech API service is now using for lifelike speech synthesis. If you pause and think for a moment about how you can perform text-to-speech, you would probably formulate a method that is very similar to the concatenative approach. In concatenative text-to-speech, texts are broken down into smaller units such as phonemes, and the corresponding recordings of the units are then combined to form a complete speech.

convolution, create lifelike robot, wavenet, (15 more...)

#artificialintelligence

Mar-27-2021, 17:45:14 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Vision > Optical Character Recognition (1.00)
  - Speech > Speech Synthesis (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.73)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found