Baidu's Deep Voice can clone speech with less than four seconds of training Computing


With only a few seconds of audio, the 'Deep Voice' software developed by China's Baidu is able to clone a human voice, raising fears about the security of biometrics. Baidu has been working on Deep Voice for over a year, and had already managed to reproduce speaker identities with about half an hour of training data. With new developments, it has lowered that time to 3.7 seconds. A believable, if low-quality, false voice can now be produced from only a single sentence of speech. Of course, more training leads to higher-quality results, especially if there is more than one sample to learn from.