Google Upgrades Its Speech-to-Text Service with Tailored Deep-Learning Models


A month after Google announced breakthroughs in Text-to-Speech generation technologies stemming from the Magenta project, the company followed through with a major upgrade of its Speech-to-Text API cloud service. The updated service uses deep-learning transcription models tailored to specific use cases: short voice commands, phone calls, and video, with a default model for all other contexts. It now handles 120 languages and variants, although model availability and feature support differ from one language to another. Business applications range from over-the-phone meetings to call centers and video transcription. Transcription accuracy is also improved for audio with multiple speakers and significant background noise.
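
To make the model selection concrete, here is a minimal sketch of how a client might pick one of the tailored models when building a request body for the v1 REST `speech:recognize` method. The model identifiers (`command_and_search`, `phone_call`, `video`, `default`) and the field names come from the public Speech-to-Text v1 API. The helper name `build_recognize_request`, the use-case labels, and the bucket URI are assumptions made for illustration only. The sketch builds the payload and does not send it.

```python
import json

# Values are the identifiers accepted by the `model` field of
# RecognitionConfig in the Speech-to-Text v1 REST API.
# Keys are hypothetical use-case labels invented for this sketch.
MODEL_FOR_USE_CASE = {
    "voice_command": "command_and_search",
    "phone_call": "phone_call",
    "video": "video",
}


def build_recognize_request(use_case, gcs_uri, language_code="en-US"):
    """Return a JSON-serializable body for a speech:recognize call.

    Unknown use cases fall back to the "default" model, mirroring the
    article's "default model for all other contexts".
    """
    model = MODEL_FOR_USE_CASE.get(use_case, "default")
    return {
        "config": {
            # encoding and sampleRateHertz are omitted here; the API can
            # read them from WAV and FLAC headers, but other audio
            # formats need them set explicitly.
            "languageCode": language_code,
            "model": model,
        },
        "audio": {"uri": gcs_uri},
    }


if __name__ == "__main__":
    # Placeholder bucket and object name, not a real resource.
    body = build_recognize_request("phone_call", "gs://example-bucket/call.wav")
    print(json.dumps(body, indent=2))
```

Because model availability varies by language, a real client would likely check the service's supported-languages list before requesting a specialized model, and fall back to `default` where it is not offered.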