Soon You Won't Be Able to Tell an AI From a Human Voice

Sep-14-2016, 17:10:25 GMT–#artificialintelligence

The choppy, cybernetic voices of digital assistants like Siri may not sound so mechanical for much longer, thanks to a significant breakthrough in using artificial intelligence to generate realistic human speech. In a new paper, scientists at Google-owned AI shop DeepMind have unveiled WaveNet, a neural network that generates audio waveforms by predicting and adapting to its own output in real time. The result is dramatically more natural-sounding computerized speech, which the researchers say reduces the perceived gap between human and computer voices by over 50 percent in both English and Chinese. The system's predictive model is a far cry from the synthesized speech systems used by "digital assistant" apps like Siri. Instead of using a "concatenative" speech system that pieces together output from a library of speech fragments recorded by one speaker (in Siri's case, voice actress Susan Bennett), WaveNet is trained on a massive database, then generates raw waveforms one audio sample at a time using what's known as an "autoregressive" model, meaning each individual sample of the waveform is predicted based on the samples that preceded it. The neural net was developed from a similar model called PixelCNN, which does the same for computer vision by predicting images one pixel at a time.
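To make the "autoregressive" idea concrete, here is a minimal Python sketch of sample-by-sample generation. It is not DeepMind's code and not a neural network: `generate_autoregressive` and `toy_predictor` are hypothetical names, and the predictor is a fixed two-tap formula that happens to continue a sine wave, standing in for the trained model that WaveNet would use. The only point illustrated is the loop, in which each new sample is computed from earlier ones and then fed back in as input.

```python
import numpy as np

OMEGA = 2 * np.pi / 16  # assumed angular step per sample (illustrative only)

def generate_autoregressive(predict_next, seed, num_samples):
    """Extend `seed` by `num_samples` values, one sample at a time.

    Each new sample is predicted from the most recent samples, then
    appended so that it becomes input for the next prediction.
    """
    samples = list(seed)
    context = len(seed)  # how many past samples the predictor sees
    for _ in range(num_samples):
        window = np.array(samples[-context:])
        samples.append(float(predict_next(window)))
    return np.array(samples)

def toy_predictor(window):
    # Hypothetical stand-in for a trained model: the recurrence
    # sin((n+1)w) = 2*cos(w)*sin(n*w) - sin((n-1)w) continues a sinusoid.
    # A real system would instead learn its predictions from recorded audio.
    return 2 * np.cos(OMEGA) * window[-1] - window[-2]

# Two seed samples of a sine wave, then 30 generated samples.
seed = [0.0, np.sin(OMEGA)]
wave = generate_autoregressive(toy_predictor, seed, 30)
```

Because every output depends on the ones before it, generation in this style is inherently sequential: sample 31 cannot be computed until sample 30 exists.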


