The Morning After: Microsoft's VALL-E AI can replicate a voice from a three-second sample
While multiple services can already create copies of your voice, they usually require a substantial amount of recorded audio. Microsoft claims its VALL-E model can simulate someone's voice from just a three-second audio sample. The speech can match both the timbre and emotional tone of the speaker, and even the acoustics of a room. It could one day be used for customized or high-end text-to-speech applications, but like deepfakes, it carries risks of misuse. Researchers trained VALL-E on 60,000 hours of English-language speech from 7,000-plus speakers in Meta's Libri-Light audio library.
Jan-11-2023, 12:16:05 GMT