Meta claims its AI improves speech recognition quality by reading lips

Jan-7-2022, 18:46:19 GMT - #artificialintelligence

People perceive speech both by listening to it and by watching speakers' lip movements. In fact, studies show that visual cues play a key role in language learning. By contrast, AI speech recognition systems are built mostly, or entirely, on audio, and they require a substantial amount of training data, typically tens of thousands of hours of recordings. To investigate whether visuals, specifically footage of mouth movement, can improve the performance of speech recognition systems, researchers at Meta (formerly Facebook) developed Audio-Visual Hidden Unit BERT (AV-HuBERT), a framework that learns to understand speech by both watching and hearing people speak.

artificial intelligence, av-hubert, machine learning

Country:
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.06)

Industry:
- Health & Medicine > Therapeutic Area (0.73)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Machine Learning (1.00)
