Google Develops AI That Can Separate Voices in a Crowd
Google Research engineers have developed a deep learning system that can separate individual voices in audio-visual recordings of crowded environments. The system mimics the "cocktail party" effect, the human brain's ability to isolate and focus on one or more particular voices in a crowd. It is designed to work with audio and video data at the same time. Google says it trained the system on over 100,000 high-quality videos of lectures and talks hosted on YouTube, each given by a single speaker with minimal background noise. The researchers trained the AI to recognize sounds based on lip and mouth movement.
Apr-23-2018, 16:15:49 GMT