Robots receive a scary-accurate new voice, courtesy of Google's DeepMind (ExtremeTech)

The WaveNet system can be thought of as an improvement on concatenative text-to-speech, in that it still relies on recordings of real human voices. But instead of chopping these recordings up and reassembling them, it uses an artificial neural network to generate synthetic utterances modeled on the voices it was trained on. The downside is that the system is computationally intensive. Modeling raw audio typically requires 16,000 samples per second, and each sample is influenced by all the previous ones, so the samples have to be produced one after another. This is well beyond the processing power of a typical smartphone, but not unthinkable for GPU-based hardware like Nvidia's DGX-1 deep learning supercomputer.
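To make the cost concrete, here is a minimal Python sketch of sample-by-sample generation. It is not WaveNet: the `toy_next_sample` function is a hypothetical stand-in for the neural network, and the 64-sample context window is an arbitrary assumption. The only figure taken from the article is the 16,000 samples per second rate. The point is simply that each new sample is computed from the ones before it, so one prediction step is needed per sample.

```python
import random

SAMPLE_RATE = 16_000  # samples per second, the figure cited in the article


def toy_next_sample(history, rng):
    """Hypothetical stand-in for a neural network's prediction step.

    Looks at the most recent samples (a real model would use a learned,
    far richer context) and returns the next sample in [-1, 1].
    """
    context = history[-64:]  # arbitrary window size, chosen for illustration
    if not context:
        return 0.0
    mean = sum(context) / len(context)
    # Damped mean of the context plus a little noise; purely illustrative.
    value = 0.9 * mean + rng.uniform(-0.1, 0.1)
    return max(-1.0, min(1.0, value))


def generate(seconds, seed=0):
    """Generate audio one sample at a time and count the prediction steps."""
    rng = random.Random(seed)
    samples = []
    steps = 0
    for _ in range(int(seconds * SAMPLE_RATE)):
        # Each new sample depends on the samples generated so far,
        # so this loop cannot simply be skipped or reordered.
        samples.append(toy_next_sample(samples, rng))
        steps += 1
    return samples, steps


if __name__ == "__main__":
    audio, steps = generate(1.0)
    print(f"1 second of audio took {steps} sequential prediction steps")
```

In this toy version each step is trivial, but if every step instead required a full pass through a deep neural network, one second of speech would mean 16,000 such passes in sequence, which is where the heavy compute demand comes from.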
