Google's AI Brainiacs Achieve Speech-Generation Breakthrough
WaveNet won't have immediate commercial applications because the system requires too much computational power: it has to sample the audio signal it is being trained on 16,000 times per second or more, DeepMind said. And then for each of those samples it has to form a prediction about what the soundwave should look like based on each of the prior samples. Even the DeepMind researchers acknowledged in their blog post that this "is a clearly challenging task."
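To make the cost concrete, here is a minimal Python sketch of sample-by-sample generation at the 16,000-samples-per-second rate quoted above. It is not DeepMind's model: `predict_next` is a hypothetical stand-in that averages the last four samples, purely for brevity, where the real system runs a trained neural network over the prior samples.

```python
SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the article


def predict_next(history):
    """Hypothetical stand-in for the trained model.

    Returns a damped average of the last four samples. A real system
    would evaluate a neural network here, which is the expensive part.
    """
    window = history[-4:]
    return 0.9 * sum(window) / len(window)


def generate(seed, milliseconds):
    """Generate audio one sample at a time, each conditioned on earlier ones."""
    samples = list(seed)
    steps = SAMPLE_RATE * milliseconds // 1000  # one prediction per sample
    for _ in range(steps):
        samples.append(predict_next(samples))
    return samples


# Ten milliseconds of audio already requires 160 sequential predictions.
audio = generate(seed=[1.0], milliseconds=10)
```

Because each prediction depends on the samples before it, the loop cannot simply be run in parallel, and one second of audio takes at least 16,000 model evaluations in a row.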
Google's DeepMind Achieves Speech-Generation Breakthrough
Google's DeepMind unit, which is working to develop super-intelligent computers, has created a system for machine-generated speech that it says outperforms existing technology by 50 percent. U.K.-based DeepMind, which Google acquired for about 400 million pounds ($533 million) in 2014, developed an artificial intelligence called WaveNet that can mimic human speech by learning how to form the individual sound waves a human voice creates, it said in a blog post Friday. In blind tests for U.S. English and Mandarin Chinese, human listeners found WaveNet-generated speech sounded more natural than that created with any of Google's existing text-to-speech programs, which are based on different technologies. WaveNet still underperformed recordings of actual human speech. Many computer-generated speech programs work by using a large data set of short recordings of a single human speaker and then combining these speech fragments to form new words.
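The fragment-combining approach described in the last sentence can be illustrated with a toy Python sketch. Everything in it is an assumption for illustration: the `FRAGMENTS` inventory, its unit names, and the sample values are made up, and production systems add unit selection and smoothing at the joins that are omitted here.

```python
# Hypothetical fragment inventory: each key is a sound unit, each value a
# short list of audio samples recorded from one speaker (made-up numbers).
FRAGMENTS = {
    "wa": [0.1, 0.3, 0.2],
    "ve": [0.0, -0.2, -0.1],
    "net": [0.4, 0.1],
}


def synthesize(units):
    """Join pre-recorded fragments end to end to form a new utterance."""
    audio = []
    for unit in units:
        audio.extend(FRAGMENTS[unit])
    return audio


word = synthesize(["wa", "ve", "net"])
```

The contrast with WaveNet is that nothing here is generated: the output can only ever contain samples that already exist in the recorded inventory.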