text-to-speech program
Who Is the Voice of Alexa?
If you've spoken to Amazon's Alexa voice assistant through the Alexa app or an Echo device, you may have wondered who the woman behind the speaker is. Is Alexa voiced by a celebrity, or perhaps an Amazon employee? You might be surprised to learn that Alexa's voice does not belong to any real person. Rather, it is generated by artificial intelligence, using software that evolved from text-to-speech technology.
Google's DeepMind AI fakes some of the most realistic human voices yet
WaveNet, as the system is called, generates voices by learning from samples of real human speech and directly modeling the audio waveform, with each new piece of audio based on the audio it has already generated. In Google's tests, both English and Mandarin Chinese listeners found WaveNet more realistic than other types of text-to-speech programs, although it was less convincing than actual human speech. An alternative approach is parametric text-to-speech, which builds a completely computer-generated voice using coded rules based on grammar or mouth sounds. Google's system, by contrast, is still based on real voice input.
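To make the idea of generating audio from "previously generated audio" concrete, here is a minimal sketch of a sample-by-sample generation loop. It is not WaveNet: the function `toy_next_sample_distribution` is a hypothetical stand-in for a trained neural network, and the 16 amplitude levels and the smoothing rule are assumptions chosen only to keep the example small.

```python
import math
import random


def toy_next_sample_distribution(history, levels=16):
    """Hypothetical stand-in for a trained model (an assumption, not WaveNet).

    Returns a probability distribution over `levels` quantized amplitude
    values, given the samples generated so far.
    """
    last = history[-1] if history else levels // 2
    # Favor amplitudes close to the previous sample so the waveform is smooth.
    weights = [math.exp(-abs(value - last)) for value in range(levels)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, levels=16, seed=0):
    """Draw each new sample conditioned on everything generated before it."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_samples):
        probs = toy_next_sample_distribution(history, levels)
        sample = rng.choices(range(levels), weights=probs, k=1)[0]
        history.append(sample)  # the new sample becomes part of the context
    return history


waveform = generate(100)
```

The point of the loop is the feedback: every sample that is drawn is appended to `history` and influences the distribution for the next one. In a real system the toy function would be replaced by a network trained on recorded speech.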
Google DeepMind's AI can mimic realistic human speech
It's still pretty easy to tell whether it's a real person who's talking or a text-to-speech program. But there might come a time when a robot could dupe you into thinking that you're speaking with a real person, thanks to a new AI called WaveNet developed by Google's DeepMind team. They have a pretty good track record when it comes to building neural networks -- you probably know them as the folks who created AlphaGo, the AI that defeated one of the world's best Go players. Currently, developers use one of two methods to create speech programs. One involves using a large collection of words and speech fragments spoken by a single person, which makes sounds and intonations hard to manipulate.
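The fragment-based method described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the `FRAGMENTS` bank and the `concatenate_speech` function are hypothetical, and the short lists of numbers stand in for recorded audio clips from a single speaker.

```python
# Hypothetical fragment bank: each list stands in for a recorded audio clip
# of one word, all spoken by the same person.
FRAGMENTS = {
    "hello": [0.1, 0.3, 0.2],
    "world": [0.4, 0.1],
}


def concatenate_speech(text, bank):
    """Build output audio by joining prerecorded fragments in word order."""
    samples = []
    for word in text.lower().split():
        if word not in bank:
            # Nothing can be spoken that was not recorded beforehand.
            raise KeyError(f"no recorded fragment for {word!r}")
        samples.extend(bank[word])
    return samples


audio = concatenate_speech("Hello world", FRAGMENTS)
```

Because the output is stitched together from fixed recordings, changing the voice, emphasis, or intonation means recording new fragments, which is the limitation the excerpt points to.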