An AI has learned how to pick a single voice out of a crowd
Devices like Amazon's Echo and Google Home can usually deal with requests from a lone person, but, like us, they struggle in situations such as a noisy cocktail party, where several people are speaking at once. Now an AI that is able to separate the voices of multiple speakers in real time promises to give automatic speech recognition a big boost, and could soon find its way into an elevator near you. The technology, developed by researchers at the Mitsubishi Electric Research Laboratories in Cambridge, Massachusetts, was demonstrated in public for the first time at this month's Combined Exhibition of Advanced Technologies show in Tokyo. It uses a machine learning technique the team calls "deep clustering" to identify unique features in the "voiceprint" of each of several speakers. It then groups the distinct features from each speaker's voice together, allowing it to disentangle multiple voices and then reconstruct what each person was saying.
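The grouping step described above can be illustrated with a toy sketch. This is not the researchers' system: the article gives no implementation details, so everything here is an assumption made for illustration. In particular, the per-bin feature vectors (`embeddings`) are faked with random numbers rather than produced by a trained network, plain k-means stands in for the grouping step, and the names `kmeans` and `separate` are hypothetical.

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Plain k-means with farthest-point initialisation; returns one label per point."""
    centers = [points[0]]
    for _ in range(1, k):
        # Pick the point farthest from all centers chosen so far.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def separate(mixture_spec, embeddings, n_speakers):
    """Group time-frequency bins by embedding, then mask the mixture per group."""
    labels = kmeans(embeddings, n_speakers).reshape(mixture_spec.shape)
    # Each binary mask keeps only the bins assigned to one group.
    return [mixture_spec * (labels == s) for s in range(n_speakers)]

# Toy data (assumed): a small magnitude "spectrogram" of two mixed voices.
rng = np.random.default_rng(0)
n_frames, n_bins = 8, 16
true_owner = rng.integers(0, 2, size=(n_frames, n_bins))  # which voice dominates each bin
mixture_spec = rng.random((n_frames, n_bins)) + 0.1

# Stand-in for a trained network: bins dominated by the same voice get nearby vectors.
speaker_dirs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
embeddings = speaker_dirs[true_owner.ravel()] + 0.05 * rng.standard_normal((n_frames * n_bins, 3))

parts = separate(mixture_spec, embeddings, n_speakers=2)
```

In this sketch each entry of `parts` is the mixture with every bin zeroed out except those grouped under one voice. A real system would also need a learned feature extractor and a step that turns each masked spectrogram back into audio, both of which are omitted here.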
Oct-24-2017, 16:31:21 GMT
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.27)
- North America > United States
- Massachusetts > Middlesex County > Cambridge (0.27)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (1.00)
- Natural Language > Chatbot (0.59)
- Representation & Reasoning > Personal Assistant Systems (0.59)
- Speech > Speech Recognition (0.96)