Google's AI watched hours of TV to learn how to read lips better than you
Researchers from Google's UK-based artificial intelligence division DeepMind have collaborated with scientists from the University of Oxford to develop the world's most advanced lip-reading software, and it probably reads lips better than you. To accomplish this, the researchers fed thousands of hours of BBC TV footage to a neural network, training it to annotate videos by analysing mouth movements with an accuracy of 46.8 percent. For context, when tasked with captioning the same video, a professional human lip-reader was almost four times less accurate, getting the right word only 12.4 percent of the time. The research builds upon previously published work by the University of Oxford that used similar techniques to build a lip-reading app called LipNet, which could read video recordings of volunteers speaking in simple sentences with an accuracy of over 90 percent. However, unlike Oxford's program, DeepMind's software, dubbed "Watch, Listen, Attend, and Spell", was trained and tested on much more challenging footage.
Nov-26-2016, 15:50:16 GMT