Microsoft's AI generates high-quality talking heads from audio
A growing body of research suggests that the facial movements of almost anyone can be synced to audio clips of speech, given a sufficiently large training corpus. In June, applied scientists at Samsung detailed an end-to-end model capable of animating the eyebrows, mouth, eyelashes, and cheeks in a person's headshot. Only a few weeks later, Udacity revealed a system that automatically generates stand-up lecture videos from audio narration. And two years ago, Carnegie Mellon researchers published a paper describing an approach for transferring facial movements from one person to another. Building on this and other work, a Microsoft Research team this week laid out a technique it claims improves the fidelity of audio-driven talking head animations.