Google is rolling out end-to-end, on-device speech recognition, driven entirely by neural networks, for speech input in its Gboard virtual keyboard app. In a blog post, Google describes a recent paper presenting a new model, trained with recurrent neural network transducer (RNN-T) technology, that is compact enough to run on a smartphone. According to "Streaming End-to-End Speech Recognition for Mobile Devices," end-to-end models directly predict character output from speech input, and are good candidates for running speech recognition on edge devices. In its experiments, the Google research team found that the RNN-T approach outperformed a conventional model based on connectionist temporal classification (CTC) in both latency and accuracy. Traditional speech recognition systems combine an acoustic model that identifies phonemes (sound units) from segments of audio, a pronunciation model that connects phonemes into words, and a language model that estimates the likelihood of a given phrase, according to the blog.
The period between the 1970s and the early 2000s was known as the "dark age of artificial intelligence." Because of limited computing power, slow processing and a lack of data, a technology once expected to change human lives turned out to be little more than an idea for science fiction, a concept that did not live up to its potential. Yet things have changed since machine learning took shape in the early 2000s. Machines that learn by themselves, to the point that Google's AlphaGo defeated a top human Go player earlier this year, have revived interest in AI, prompting companies to devote huge R&D resources to this area. In fact, the growth potential for AI is not to be ignored.
Amazon's other voice-controlled Fire TV devices require a push of the remote's mic button or a separate Echo device with Amazon's Alexa voice assistant. Unlike the other devices, the Cube will let viewers switch between streaming services like Netflix and regular cable channels with such voice commands as "Alexa, turn on ESPN." The new device can also do typical Alexa tasks, such as playing "Jeopardy!" or fetching the weather, even when the TV is off.
Google said its visual assistant is rolling out to Android phones this summer. The feature responds to a Google Assistant voice request with a full-screen experience that brings up information as well as ways to interact with apps. When an Android user makes a query through Google Assistant, Google will provide a more interactive visual experience on the phone. That includes ways to control smart home products, like thermostats, or to interact directly with apps like the Starbucks app. For example, a voice query such as "what is the temperature right now" brings up a display with a way to change the temperature. Google's visual assistant is coming to iOS devices this year.