A model that can recognize speech in different languages from a speaker's lip movements
In recent years, deep learning techniques have achieved remarkable results in numerous language- and image-processing tasks. These include visual speech recognition (VSR), the task of identifying the content of speech solely by analyzing a speaker's lip movements. While some deep learning models have achieved highly promising results on VSR tasks, they have primarily been trained to recognize speech in English, as most existing training datasets contain only English speech. This limits their potential user base to people who live or work in English-speaking contexts. Researchers at Imperial College London have recently developed a new model that can tackle VSR tasks in multiple languages.
Nov-26-2022, 01:55:12 GMT