lip read
Hearing aid technology that scans facial movements can lip read through masks
A hearing aid technology has been developed that scans your facial movements and uses an artificial intelligence (AI) to work out what is being said. Developed by engineers at the University of Glasgow, the system is even able to read the lips of people who are wearing a mask. The team trained algorithms with data collected by scanning people's faces with radar and Wi-Fi signals while they were speaking. This allowed the system to correctly interpret speech up to 95 per cent of the time for unmasked lips, and up to 83 per cent of the time with a mask. If integrated into hearing aids, it could help deaf and hard-of-hearing people to focus on sounds more easily in noisy environments.
Meta Says Its AI Can Lip Read to Boost Speech Recognition
Speech is the main channel of face-to-face communication, but understanding it involves more than listening to the words people say. Reading someone's lips helps you parse their meaning when you cannot hear them clearly, and studies have shown that speech is much harder to understand when you cannot see how the speaker's mouth is moving. Meta has developed a new framework called AV-HuBERT that takes both audio and lip movements into account, which could vastly improve its speech recognition, although this is only a test at this point. Essentially, Meta is trying to see whether anything can be gained by allowing AI to read lips as well as listen to audio recordings.
Google's AI can now lip read better than humans after watching thousands of hours of TV
The research follows similar work published by a separate group at the University of Oxford earlier this month. Using related techniques, these scientists were able to create a lip-reading program called LipNet that achieved 93.4 percent accuracy in tests, compared to 52.3 percent human accuracy. However, LipNet was only tested on specially recorded footage that used volunteers speaking formulaic sentences. By comparison, DeepMind's software -- known as "Watch, Listen, Attend, and Spell" -- was tested on far more challenging footage: transcribing natural, unscripted conversations from BBC politics shows. More than 5,000 hours of footage from TV shows including Newsnight, Question Time, and the World Today was used to train DeepMind's "Watch, Listen, Attend, and Spell" program. The videos included 118,000 different sentences and some 17,500 unique words, compared to LipNet's test database, whose videos contained just 51 unique words.
Google's AI can now lip read better than humans after watching thousands of hours of TV
Researchers from Google's AI division DeepMind and the University of Oxford have used artificial intelligence to create the most accurate lip-reading software ever. Using thousands of hours of TV footage from the BBC, scientists trained a neural network to annotate video footage with 46.8 percent accuracy. That might not seem that impressive at first -- especially compared to AI accuracy rates when transcribing audio -- but tested on the same footage, a professional human lip-reader was only able to get the right word 12.4 percent of the time. The research follows similar work published by a separate group at the University of Oxford earlier this month. Using related techniques, these scientists were able to create a lip-reading program called LipNet that achieved 93.4 percent accuracy in tests, compared to 52.3 percent human accuracy.
An artificial intelligence that lip reads better than humans. - IP EXPO Event Series
Scientists at Oxford University have developed a machine that can lip-read better than humans. The artificial intelligence system – LipNet – watches video of a person speaking and matches the text to the movement of their mouth with 93% accuracy, the researchers said. They suggested that automating the process could help millions. But experts say that more testing in real-life situations is needed to fully understand the benefits. Lip-reading is notoriously tricky, with professionals only able to decipher what someone is saying around 60% of the time.