Browse through your Facebook News Feed and you'll see photos play a prominent part, meaning visually impaired users are missing out on a lot of updates from their friends. Now Facebook's engineers have harnessed an artificial intelligence system to describe these pictures to blind and partially sighted users. Facebook is calling the system "automatic alternative text" and it's based on a neural network with millions of learnable parameters, trained on millions of example images. Such neural networks – vast, layered webs of simple processing units loosely modeled on the human brain – are playing an increasingly important role in modern computing. The AI software doesn't actually "see" the picture, but it can compare the objects in it with what it has learned from millions of similar photos and make an educated guess about what's being shown.
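That "educated guess" is, at heart, a classification step: new images are compared against what the network learned from labeled examples. The sketch below illustrates the idea with a toy nearest-neighbor classifier; the feature vectors and labels are invented for illustration, and Facebook's actual system is a deep convolutional network, not a literal lookup like this:

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Hypothetical "learned examples": feature vectors for labeled photos.
# A real network extracts thousands of features; these 3-dimensional
# vectors are stand-ins for illustration only.
labeled_examples = [
    ((0.9, 0.1, 0.2), "dog"),
    ((0.8, 0.2, 0.1), "dog"),
    ((0.1, 0.9, 0.7), "beach"),
    ((0.2, 0.8, 0.9), "beach"),
]

def guess_label(features, examples):
    """Return the label of the closest known example - an 'educated guess'."""
    closest = min(examples, key=lambda ex: dist(features, ex[0]))
    return closest[1]

print(guess_label((0.85, 0.15, 0.15), labeled_examples))  # -> dog
```

The guess is only as good as the examples: a photo unlike anything in the training set will still be assigned the nearest label, which is why production systems also report a confidence score.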
Facebook is reaching out to the more than 280 million visually impaired people on the planet with a new technology called automatic alternative text, which promises to help the blind community experience the social network in the same way others enjoy it. Automatic alternative text, or automatic alt text, is described by Facebook engineers as a new development that generates a description of a photo using advancements in object recognition technology. The company detailed the technology in an April 4 blog post written by Shaomei Wu, a Facebook software engineer; Hermes Pique, a software engineer on iOS; and Jeffrey Wieland, Facebook's head of accessibility. The involvement of an iOS engineer reflects the fact that automatic alt text will first be available on iOS screen readers set to English. However, the blog post notes that Facebook plans to add the functionality for other languages and platforms soon.
For many of us, accessing our favorite social media sites is as easy as opening a browser or mobile app. We log in and expect everything to simply work. But for blind and visually impaired users who rely on screen reader software, accessibility is far from simple. With websites like Facebook, accessibility has to be coded into the back end: the content must be structured so that a screen reader can interpret it and output it, either by reading it aloud or by showing it on a refreshable Braille display.
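That interpretation step can be sketched with Python's standard-library HTML parser: an image carrying an `alt` attribute is announced with its description, while an unlabeled image leaves the user with only a generic placeholder. The HTML snippet and the placeholder wording here are invented for illustration; real screen readers such as VoiceOver are far more sophisticated:

```python
from html.parser import HTMLParser

class AltTextReader(HTMLParser):
    """Minimal stand-in for how a screen reader announces images."""

    def __init__(self):
        super().__init__()
        self.announcements = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            # With alt text the user hears a description; without it,
            # they learn only that an image exists.
            self.announcements.append(alt if alt else "unlabeled image")

reader = AltTextReader()
reader.feed('<img src="a.jpg" alt="Two friends smiling outdoors"><img src="b.jpg">')
print(reader.announcements)  # -> ['Two friends smiling outdoors', 'unlabeled image']
```

The gap between the two announcements is exactly what automatic alt text aims to close: generating a description when the uploader supplied none.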
The team behind the code has been working on it for more than ten months. "At its core, the engine is a deep convolutional neural network with millions of learnable parameters," reads Facebook's detailed description of how automatic alt text works. Once the AI has identified what is in the image, it needs to inform the user. "For each photo, we first report the number of people (approximated by the number of faces) in the photos, and whether they are smiling or not; we then list all the objects we detect, ordered by the detection algorithm's confidence; scenes, such as settings and properties of the entire image (e.g., indoor, outdoor, selfie, meme), will be presented at the end," Facebook explained. The feature has launched initially in the UK, US, Australia, and New Zealand, with Facebook saying it will roll out the feature to other languages and platforms "soon".
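The ordering Facebook describes (people first, then objects by confidence, then scene tags) is essentially a string-assembly step on top of the network's output. The sketch below hard-codes detection results as stand-ins for what the model would produce, and the "Image may contain" prefix is illustrative wording, not necessarily Facebook's exact phrasing:

```python
def build_alt_text(face_count, smiling, objects, scenes):
    """Assemble alt text in the order Facebook describes: people first,
    then detected objects by confidence, then scene tags at the end."""
    parts = []
    if face_count:
        mood = ", smiling" if smiling else ""
        parts.append(f"{face_count} people{mood}")
    # Objects arrive as (label, confidence) pairs; list highest-confidence first.
    for label, _conf in sorted(objects, key=lambda o: o[1], reverse=True):
        parts.append(label)
    parts.extend(scenes)  # e.g. "indoor", "outdoor", "selfie", "meme"
    return "Image may contain: " + ", ".join(parts)

print(build_alt_text(2, True, [("tree", 0.7), ("dog", 0.9)], ["outdoor"]))
# -> Image may contain: 2 people, smiling, dog, tree, outdoor
```

Listing objects by confidence means the user hears the detections the model is most sure about first, so even a truncated readout conveys the most reliable information.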
If you forget to tag or add a description when uploading a photo or gallery to Facebook, it can be tough to find an image when you need it. Or at least it used to be. The social network revealed today that it has built an AI image search system that can "see" things in your photos even when you forget to add those identifiers. Facebook says the system uses its Lumos platform to understand the content of photos and videos and quickly sort through the items you've uploaded. This means that even if you can't remember when a photo was taken, you might still be able to find it with ease if you remember its content.
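Content-based search of this kind can be sketched as matching a query term against per-photo concept predictions. The photo names, tags, and scores below are invented for illustration, and Facebook's Lumos platform works on learned visual features rather than a literal tag index like this:

```python
# Hypothetical index: photo id -> concepts the model predicted, with scores.
photo_concepts = {
    "IMG_001.jpg": {"beach": 0.92, "dog": 0.40},
    "IMG_002.jpg": {"birthday cake": 0.88, "candles": 0.81},
    "IMG_003.jpg": {"dog": 0.95, "park": 0.77},
}

def search_photos(query, index, threshold=0.5):
    """Return photos whose predicted concepts match the query term,
    best match first - no user-supplied tags or captions needed."""
    hits = [(pid, concepts[query])
            for pid, concepts in index.items()
            if concepts.get(query, 0.0) >= threshold]
    return [pid for pid, _score in sorted(hits, key=lambda h: h[1], reverse=True)]

print(search_photos("dog", photo_concepts))  # -> ['IMG_003.jpg']
```

A confidence threshold keeps weak detections (like the 0.40 "dog" in the first photo) out of the results, trading a little recall for far fewer false matches.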