Speech Recognition


Still talking to Cortana? Microsoft gives you more control over how your voice recordings are used

ZDNet

Users of Microsoft's voice-enabled services such as Cortana will now be able to decide whether or not the audio recordings of their interactions can be used by the company to improve its speech recognition algorithms. By default, customers' voice clips will not be contributed for review, said Microsoft in a new blog post; instead, users will be required to actively opt in to allow the company to store and access their audio recordings. Customers who have chosen to remain opted out will still be able to use all of Microsoft's voice-enabled products and services, confirmed the company. Their audio recordings won't be stored, but Microsoft will still have access to some information associated with voice activity, such as the transcriptions automatically generated during user interactions with speech recognition AI. Once they have opted in, however, users' voice data might be listened to by Microsoft employees and contractors as part of a process to refine the AI systems used to power speech recognition technology.


Amazon launches new service that lets auto, device makers build customized voice assistants

ZDNet

Amazon is rolling out a new service that gives companies access to Alexa's AI smarts to build their own voice assistants. Amazon said the Alexa Custom Assistant lets automakers and device manufacturers create voice assistants that are built on Alexa technology and work in cooperation with Alexa. The company said this is the first time it's offering this type of access to its voice AI technology. Automaker Fiat Chrysler is the first to integrate Alexa Custom Assistant into its vehicles. Among the key features of the service, Amazon is highlighting what it calls an industry-first capability of simultaneous multi-assistant cooperation.


Google now owns Fitbit

Engadget

Google has completed its $2.1 billion purchase of Fitbit, more than a year after the deal was first announced. The EU approved the acquisition in late December, clearing the way for Google's ownership of what is perhaps the best-known brand for mainstream fitness-tracking devices. Fitbit co-founder and CEO James Park reiterated in a letter today that Fitbit would continue to be device-agnostic, making products that work with both iPhones and Android devices. Both Park and Google's Rick Osterloh also reiterated that this deal was always about "devices, not data." That's shorthand for Google and Fitbit's pledge to keep user data private going forward; Park said that "Fitbit users' health and wellness data won't be used for Google ads and this data will be kept separate from other Google ad data."


Dolby promises better call quality with Dolby Voice for PCs

Engadget

Dolby is looking to help your audio and video calls sound as clear as possible by optimizing microphone and speaker performance. It has announced a tool called Dolby Voice for PCs, which it says removes unwanted background noise and echoes, while automatically adjusting levels for voices that are quiet or far away from a microphone. Voice calls with several participants can get somewhat chaotic if people are talking over each other, particularly given that many PCs only support mono-based communications. However, if your conference call app offers stereo audio, Dolby says it can separate the voices to make them clearer and more natural-sounding. The company claims Dolby Voice for PCs can also improve speech recognition intelligibility for voice assistants.


Update your home and kitchen with Amazon's New Year, New You sale

Mashable

Here are the best home and kitchen deals from Amazon's New Year, New You sale. OUR TOP PICK: iRobot Roomba 981 Robot Vacuum -- save $159.01. Whether you made a resolution or not, the new year is always a good time to assess your space and decide what could use an upgrade. Amazon's New Year, New You sale offers the perfect chance to grab items that will set you up for success in 2021, from air fryers that will help you cut calories to smart home gadgets that'll keep your home safe. We have rounded up the best deals from the home and kitchen refresh category that will spark some new-year inspiration and make your life easier. If you want your home to be cleaner in 2021, but you don't necessarily want to do more cleaning, a robot vacuum will solve all your problems. This Roomba is the mini housekeeper of your dreams, and features super strong suction, home mapping, smart navigation, and voice-control capabilities.


TP-Link adds voice control to its newest mesh WiFi router

Engadget

TP-Link has today announced an updated version of its Deco mesh networking gear that now has voice control through Amazon Alexa. The Deco Voice X20 packs a smart speaker into every satellite point, letting users control the smart parts of their home without buying more Echo Dots. The two-pack you can buy at retail is said to cover 4,000 square feet in WiFi 6, with truly "seamless roaming." The hardware is pretty interesting to look at, too, with a white cylinder floating on a hot-rod red base. Mesh networks rely upon gadgets being strewn around your home in prominent places, not hidden behind cupboards. To encourage this, device makers have both made their gear look better and made it do more, so that it finds a place in your heart.


Over 100 Voice AI Predictions for 2021 from 50 Industry Leaders - Voicebot.ai

#artificialintelligence

This represents our fifth annual Voice AI predictions article and there is no question it is the most interesting and insightful to date. It is also the largest, with over 100 predictions from 50 voice industry leaders. You will note that some of our guest contributors are confident enough to make multiple predictions. It increases the odds at least one of them will be correct. What is striking for our 2021 issue is the breadth of predictions and the interesting insights. The industry is simply more mature, has seen more, and has a better grasp on what is coming. I enjoyed reading this year's predictions and am sure you will as well. Despite the breadth of topics covered, there are at least two topics that arose with meaningfully higher frequency than the others. Predictions related to the rise of custom assistants were mentioned by at least 11 contributors, followed by an increased focus on voice solutions while on the go. Personalization, both in terms of user preference and emotion recognition or empathy, and a rise in multimodal user experiences were next in line, mentioned by about 10% of the contributors. After that, a number of topics showed some popularity, ranging from more rapid voice AI adoption in customer service (including virtual humans) to more growth in voice assistant features for audio media. It was interesting to see two guests (Audrey Arbeeny and Kirill Petrov) mention an expected rise of voice assistants and custom synthetic voices in games, a couple who are optimistic about Apple making a big Siri update this year (Brian Roemmele and Max Child), and how AR might spur voice adoption (Joan Palmiter Bajorek and Craig Sanders).


You can open LG's new refrigerators with your voice

Engadget

Voice-controlled refrigerators are already a practical reality, but LG might just make them particularly handy. The tech giant has introduced a 2021 line of InstaView fridges that can open the door with a voice command. That's useful when you have your hands full with groceries, of course, but it could also be helpful during a pandemic where you have to be mindful of what you touch. You can expect more conventional voice features like Amazon Dash replenishment, checking the ice and water dispensers or asking about the day's schedule. The new models also add UV light-based disinfection for water dispenser taps (again, helpful in a pandemic).


A learning perspective on the emergence of abstractions: the curious case of phonemes

arXiv.org Machine Learning

In the present paper we use a range of modeling techniques to investigate whether an abstract phone could emerge from exposure to speech sounds. We test two opposing principles regarding the development of language knowledge in linguistically untrained language users: Memory-Based Learning (MBL) and Error-Correction Learning (ECL). A process of generalization underlies the abstractions linguists operate with, and we probed whether MBL and ECL could give rise to a type of language knowledge that resembles linguistic abstractions. Each model was presented with a significant amount of pre-processed speech produced by one speaker. We assessed the consistency or stability of what the models have learned and their ability to give rise to abstract categories. The two types of models fare differently with regard to these tests. We show that ECL learning models can learn abstractions and that at least part of the phone inventory can be reliably identified from the input.
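As a rough illustration of what error-correction learning means in practice, the sketch below uses the Widrow-Hoff delta rule, one common instance of the idea. It is not the paper's model: the function names, the learning rate, and the two-cue toy data are all hypothetical, chosen only to show weights being adjusted in proportion to prediction error.

```python
# Minimal sketch of error-correction learning via the Widrow-Hoff delta rule.
# Assumption: this is a generic illustration, not the models used in the paper.

def predict(weights, cues):
    """Return the activation of each outcome for one cue vector."""
    n_outcomes = len(weights[0])
    return [
        sum(weights[c][k] * cues[c] for c in range(len(cues)))
        for k in range(n_outcomes)
    ]


def train_delta_rule(samples, n_cues, n_outcomes, lr=0.1, epochs=50):
    """Learn cue-to-outcome weights by repeatedly correcting prediction error."""
    weights = [[0.0] * n_outcomes for _ in range(n_cues)]
    for _ in range(epochs):
        for cues, targets in samples:
            activations = predict(weights, cues)
            for k in range(n_outcomes):
                error = targets[k] - activations[k]
                for c in range(n_cues):
                    # Each weight moves in proportion to the error and the cue.
                    weights[c][k] += lr * error * cues[c]
    return weights


# Hypothetical toy data: two acoustic cues, two phone-like outcomes.
toy_samples = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0]),
]
learned = train_delta_rule(toy_samples, n_cues=2, n_outcomes=2)
```

A memory-based learner would instead store the training exemplars and classify new input by similarity to them, which is the contrast the abstract draws.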


CIF-based Collaborative Decoding for End-to-end Contextual Speech Recognition

arXiv.org Artificial Intelligence

End-to-end (E2E) models have achieved promising results on multiple speech recognition benchmarks and have shown the potential to become the mainstream. However, the unified structure and the E2E training hamper injecting contextual information into them for contextual biasing. Though contextual LAS (CLAS) gives an excellent all-neural solution, the degree of biasing to given context information is not explicitly controllable. In this paper, we focus on incorporating context information into the continuous integrate-and-fire (CIF) based model, which supports contextual biasing in a more controllable fashion. Specifically, an extra context processing network is introduced to extract contextual embeddings, integrate acoustically relevant context information and decode the contextual output distribution, thus forming a collaborative decoding with the decoder of the CIF-based model. Evaluated on the named-entity-rich evaluation sets of HKUST/AISHELL-2, our method brings a relative character error rate (CER) reduction of 8.83%/21.13% and a relative named entity character error rate (NE-CER) reduction of 40.14%/51.50% when compared with a strong baseline. Besides, it keeps the performance on the original evaluation set without degradation.
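For readers unfamiliar with the metrics quoted above, the following sketch shows how a character error rate and a relative reduction are conventionally computed. It assumes the standard definition of CER as character-level edit distance divided by reference length; the function names are hypothetical, and the abstract does not say which scoring tool the authors used.

```python
# Minimal sketch of CER and relative error reduction.
# Assumption: standard definitions, not the authors' evaluation code.

def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Fraction of the baseline error rate removed by the improved system."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - improved) / baseline
```

Under this definition, a system that lowers CER from 20% to 15% achieves a 25% relative reduction, even though the absolute drop is only 5 points. The percentages in the abstract are relative figures of this kind.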