Google's DeepMind Claims Massive Progress in Synthesized Speech

Sep-9-2016, 17:35:24 GMT–#artificialintelligence

Researchers at Google's DeepMind artificial intelligence division claim to have found a way of producing much more natural-sounding synthesized speech than the techniques currently in use. Existing text-to-speech (TTS) systems tend to rely on an approach called concatenative TTS, in which the audio is generated by recombining fragments of recorded speech. An alternative, parametric TTS, generates speech by passing information through a vocoder, but the result sounds even less natural. DeepMind's new technique, called WaveNet, instead learns from the audio it is fed and produces raw audio one sample at a time. To give an idea of how fine-grained that is, the model has to generate at least 16,000 samples for every second of audio.
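To make the sample-by-sample idea concrete, here is a minimal Python sketch. It is not WaveNet and not DeepMind's code: the function names are hypothetical, and the "model" is a toy stand-in that predicts each new sample from the two before it using the standard recurrence for a sine wave (the 440 Hz tone is an arbitrary choice for illustration). The only figure taken from the article is the 16,000 samples per second rate.

```python
import math
from typing import List

SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the article
TONE_HZ = 440.0       # assumption: arbitrary test tone, not from the article
OMEGA = 2 * math.pi * TONE_HZ / SAMPLE_RATE  # phase step per sample


def samples_needed(seconds: float, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of individual samples that must be produced for a clip."""
    return int(round(seconds * sample_rate))


def toy_next_sample(history: List[float]) -> float:
    """Hypothetical stand-in for a learned model.

    Predicts the next sample from the two previous ones using the identity
    sin((n+1)w) = 2*cos(w)*sin(n*w) - sin((n-1)w).
    """
    return 2 * math.cos(OMEGA) * history[-1] - history[-2]


def generate(seconds: float) -> List[float]:
    """Autoregressive loop: every new sample depends on earlier output."""
    audio = [0.0, math.sin(OMEGA)]  # two seed samples: sin(0) and sin(w)
    target = samples_needed(seconds)
    while len(audio) < target:
        audio.append(toy_next_sample(audio))
    return audio
```

Calling `samples_needed(1.0)` gives the number of prediction steps for one second of audio at this rate, and `generate(0.01)` runs the loop for a 10-millisecond clip. A real system would replace `toy_next_sample` with a trained neural network, which is where the computational cost of working at this resolution comes from.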
