DeepMind Unveils WaveNet - A Deep Neural Network for Speech and Audio Synthesis

Sep-20-2016, 10:35:53 GMT–#artificialintelligence

Google's DeepMind has announced WaveNet, a fully convolutional, probabilistic, autoregressive deep neural network. According to DeepMind, it synthesizes new speech and music from audio and sounds more natural than the best existing text-to-speech (TTS) systems. Speech synthesis is largely based on concatenative TTS, in which a database of short speech fragments is recorded from a single speaker and recombined to form speech. This approach is inflexible and cannot easily be adapted to new voices: drastically altering the properties of an existing voice often means rebuilding the database from scratch. DeepMind notes that whereas previous models typically hinge on a large audio dataset from a single source, that is, a single person, WaveNet keeps its model as a set of parameters that can be modified based on new input.
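To make "convolutional, probabilistic and autoregressive" concrete, here is a minimal Python sketch of the general idea: each new audio sample is drawn from a probability distribution computed only from earlier samples, using a causal convolution over the past. This is a toy illustration, not DeepMind's implementation. The function names, the filter taps, the 17 amplitude levels and the distance-based scoring rule are all assumptions made up for the example; a real model would learn its weights from recorded audio.

```python
import math
import random


def causal_conv(signal, weights, dilation=1):
    """Causal 1-D convolution: output[t] uses only signal[t], signal[t-d], ..."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(history, levels, weights, rng):
    """Draw the next quantized sample from p(x_t | x_1, ..., x_{t-1})."""
    # Summarize the past with a causal filter; the last output is the context.
    context = causal_conv(history, weights)[-1] if history else 0.0
    # Hypothetical stand-in for a trained network: favor levels near the context.
    logits = [-10.0 * (level - context) ** 2 for level in levels]
    probs = softmax(logits)
    return rng.choices(levels, weights=probs, k=1)[0]


def generate(n_samples, seed=0):
    """Generate audio one sample at a time, feeding each output back as input."""
    rng = random.Random(seed)
    levels = [i / 8.0 - 1.0 for i in range(17)]  # 17 amplitude levels in [-1, 1]
    weights = [0.6, 0.3, 0.1]                    # hypothetical filter taps
    audio = []
    for _ in range(n_samples):
        audio.append(sample_next(audio, levels, weights, rng))
    return audio


if __name__ == "__main__":
    print(generate(10))
```

Because the "voice" in this sketch lives entirely in `weights` and the scoring rule rather than in a database of recorded fragments, changing those parameters changes the output without re-recording anything, which is the contrast with concatenative TTS that the summary draws.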

artificial intelligence, machine learning, wavenet

Genre:
- Research Report (0.36)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)