Voice-controlled technology like Amazon Echo, Siri, or hands-free features in Google Maps is something we're starting to take for granted. But as Mary Meeker's 2017 Internet Trends Report noted, voice controls are changing human-computer interfaces, and industries, broadly. Speech recognition and voice controls are being added to medical devices and business applications, even vehicles and industrial robotics. But there's a problem: today's voice systems have been built for standard speech. That leaves out millions of people who live with speech impairments, or who simply have a strong accent.
While voice-enabled assistants like Siri and Alexa have made the lives of millions of Americans a little easier, the software systems they run on are not great at accommodating a particular group of users: those with speech disabilities and impairments. This means that the roughly 7.5 million people in the U.S. who have trouble using their voice, and the more than 3 million who stutter, are largely being left out of the voice-assistant revolution. This lack of accessibility becomes even more glaring when you consider that many individuals with speech disabilities also have limited mobility and motor skills, meaning they might benefit even more from such digital assistants. Moira Corcoran reports on the smaller tech companies and startups that have started to work on software that's more inclusive of all speech, and what larger firms like Amazon and Microsoft have to say about making more individualized and accessible technologies.
Artificial intelligence is at the root of several entirely new platforms on which customers and companies can interact. Voice, augmented reality, and chatbots are powered by natural language processing, computer vision, and machine-learning algorithms. Each technology offers considerable opportunities for companies to deliver more personal, useful, and relevant service to their customers. Voice-controlled user interfaces have been around since 1952, when Bell Labs produced Audrey, a machine that could understand spoken numbers. But the current wave of voice technology was started by Amazon just a few years ago.
Devices and tools activated through speech will soon be a primary way people interact with technology, yet none of the main voice assistants, including Amazon's Alexa, Apple's Siri, and Google Assistant, supports a single native African language. Mozilla has sought to address this problem through the Common Voice project, which is now working to expand voice technology to the 100 million people who speak Kiswahili across Kenya, Uganda, Tanzania, Rwanda, Burundi, and South Sudan. The open source project makes it easy for anyone to donate their voice to a publicly available database that can then be used to train voice-enabled devices. Over the past two years, more than 840 Rwandans have donated over 1,700 hours of voice data in Kinyarwanda, a language with over 12 million speakers. That voice data is now being used to help train voice chatbots, with speech-to-text and text-to-speech functionality, that share important information about COVID-19, according to Chenai Chair, special advisor for Africa Innovation at the Mozilla Foundation. A handful of major tech companies control the voice data currently used to train machine-learning algorithms, posing a challenge for companies seeking to develop high-quality speech recognition technologies and exacerbating the voice recognition divide between English speakers and the rest of the world.
In 2015, 1.7 million voice-first devices were shipped across the U.S. That number rose to 6.5 million in 2016, a jump that signaled growing demand for voice search in the years ahead. Voice-search technology has existed for many years, but its evolution has only just begun. From automated voice-recognition phone systems to simple voice-to-text dictation devices, voice technology has been adopted in different forms all across the globe.