Google develops on-device real-time speech recognition with new neural network technique
Google is rolling out end-to-end, on-device speech recognition driven entirely by neural networks for speech input in its Gboard virtual keyboard app. In a blog post, Google describes a recent paper presenting a new model, trained using recurrent neural network transducer (RNN-T) technology, that is compact enough to run on a smartphone. According to "Streaming End-to-End Speech Recognition for Mobile Devices," end-to-end models directly predict character output from speech input, which makes them good candidates for running speech recognition on edge devices. In its experiments, the Google research team found that the RNN-T approach outperformed a conventional model based on connectionist temporal classification (CTC) in both latency and accuracy. Traditional speech recognition systems combine three components, according to the blog: an acoustic model that identifies phonemes (sound units) from segments of audio, a pronunciation model that connects phonemes into words, and a language model that scores the likelihood of a given phrase.
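For readers who want a concrete picture of the traditional three-stage pipeline described above, here is a toy Python sketch. It is not Google's system: the lookup tables, segment IDs, scores, and the `recognize` function are all hypothetical stand-ins invented for illustration, with plain dictionaries in place of trained models.

```python
from itertools import product

# Hypothetical acoustic model: audio segment id -> phoneme (made-up data).
ACOUSTIC_MODEL = {"s1": "HH", "s2": "AY", "s3": "DH", "s4": "EH", "s5": "R"}

# Hypothetical pronunciation model: phoneme sequence -> candidate words.
LEXICON = {
    ("HH", "AY"): ["hi", "high"],
    ("DH", "EH", "R"): ["there", "their"],
}

# Hypothetical language model: made-up likelihood scores for whole phrases.
LANGUAGE_MODEL = {
    "hi there": 0.6,
    "high there": 0.05,
    "hi their": 0.05,
    "high their": 0.01,
}


def recognize(word_chunks):
    """Run the three stages on audio already split into per-word chunks."""
    candidates = []
    for chunk in word_chunks:
        # Stage 1: map each audio segment to a phoneme.
        phonemes = tuple(ACOUSTIC_MODEL[segment] for segment in chunk)
        # Stage 2: map the phoneme sequence to candidate words.
        candidates.append(LEXICON[phonemes])
    # Stage 3: score every word combination and keep the likeliest phrase.
    phrases = [" ".join(words) for words in product(*candidates)]
    return max(phrases, key=lambda phrase: LANGUAGE_MODEL.get(phrase, 0.0))


print(recognize([["s1", "s2"], ["s3", "s4", "s5"]]))
```

An end-to-end model such as the RNN-T described in the article replaces these separate stages with a single neural network that emits characters directly from the audio.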
Mar-27-2019, 12:31:49 GMT