As many as 1.5 billion people in the world speak English, including those who speak it as a second language. That may sound like a lot, but it means that four out of every five people do not speak English. Any speech recognition or natural language technology built primarily for English speakers will therefore miss out on 5.9 billion potential customers. That is a big opportunity, but with 6,500 spoken languages still in use throughout the world, it is also a very big challenge. Speech technology has solid roots in American research.
Google is adding a multitude of languages to its speech recognition capabilities. The expansion will cover 30 international languages and local dialects, including languages spoken in emerging regions of India and Africa. This will increase the total number of supported languages to 119. The update will include eight additional Indian languages and two African languages, Amharic and Swahili.
Google AI researchers are applying computer vision techniques to visual representations of sound waves to achieve state-of-the-art speech recognition performance without the use of a language model. The researchers say the SpecAugment method requires no additional data and can be used without adaptation of underlying language models. "An unexpected outcome of our research was that models trained with SpecAugment out-performed all prior methods even without the aid of a language model," Google AI resident Daniel S. Park and research scientist William Chan said in a blog post today. "While our networks still benefit from adding a language model, our results are encouraging in that it suggests the possibility of training networks that can be used for practical purposes without the aid of a language model." SpecAugment works in part by applying data augmentation techniques borrowed from image analysis to spectrograms, which are visual representations of speech.
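To make the idea of augmenting a spectrogram concrete, here is a minimal sketch of one kind of operation associated with SpecAugment: masking a band of frequency bins and a block of time frames. The excerpt above does not spell out the specific operations, so treat this as an illustration under stated assumptions rather than the researchers' implementation. The function name `mask_spectrogram`, its parameters, and the mask widths are all hypothetical, and the published method also includes a time-warping step that is left out here.

```python
import random


def mask_spectrogram(spec, max_freq_width=2, max_time_width=3, fill=0.0, rng=None):
    """Return a copy of `spec` with one frequency band and one time span masked.

    `spec` is a list of rows, one row per frequency bin, and each row holds
    one value per time frame. All names and defaults here are illustrative.
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: blank out f consecutive frequency bins.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [fill] * n_time

    # Time mask: blank out t consecutive time frames in every bin.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        row[t0:t0 + t] = [fill] * t

    return out


# Toy input: 8 frequency bins by 10 time frames, all ones.
spec = [[1.0] * 10 for _ in range(8)]
augmented = mask_spectrogram(spec, rng=random.Random(0))
```

Because the masks are drawn at random on each call, the same recording can yield many slightly different training examples, which is consistent with the claim that the method needs no additional data.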
Co-located in Silicon Valley and Beijing, Baidu Research brings together top talent from around the world to focus on future-looking fundamental research in artificial intelligence. Our research directions include deep learning, computer vision, speech recognition and synthesis, natural language understanding, data mining and knowledge discovery, business intelligence, artificial general intelligence, high-performance computing, robotics and autonomous driving. At Baidu Research, we aim to revolutionize human-machine interfaces with the latest artificial intelligence techniques. Our Deep Voice project was [...] The AAAI (Association for the Advancement of Artificial Intelligence) is one of the world's premier artificial intelligence conferences, with annual summits [...] Today, we are excited to announce the hiring of three world-renowned artificial intelligence scientists, Dr. Kenneth Church, Dr. Jun Huan [...]