Alexa speech normalization AI reduces errors by up to 81%


Text normalization is a fundamental processing step in most natural language systems. In the case of Amazon's Alexa, the spoken request "Book me a table at 5:00 p.m." might be transcribed by the assistant's automatic speech recognizer as "five p m" and then reformatted to "5:00PM." In the other direction, Alexa might convert "5:00PM" to "five p m" for its text-to-speech synthesizer. So how does this work? Currently, Amazon's voice assistant relies on "thousands" of handwritten normalization rules for dates, email addresses, numbers, abbreviations, and other expressions, according to Alexa AI group applied scientist Ming Sun and Alexa Speech machine learning scientist Yuzong Liu.
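
To give a rough sense of what one handwritten normalization rule might look like, here is a minimal Python sketch that verbalizes clock times for a text-to-speech front end. It is an illustration only, not Amazon's implementation: the function names, the regular expression, and the small lookup tables are all hypothetical, and a production system would need many more rules to cover dates, email addresses, numbers, and abbreviations.

```python
import re

# Hypothetical lookup tables; a real rule set would cover far more cases.
_HOUR_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five", 6: "six",
    7: "seven", 8: "eight", 9: "nine", 10: "ten", 11: "eleven", 12: "twelve",
}
_MINUTE_WORDS = {0: "", 15: "fifteen", 30: "thirty", 45: "forty five"}

# Matches written times such as "5:00PM" or "5:30 pm".
_TIME_PATTERN = re.compile(r"\b(\d{1,2}):(\d{2})\s*([AP])M\b", re.IGNORECASE)


def _verbalize_time(match):
    """Turn one matched written time into its spoken form."""
    hour = int(match.group(1))
    minute = int(match.group(2))
    meridiem = match.group(3).lower()
    # Leave anything this toy rule does not cover untouched.
    if hour not in _HOUR_WORDS or minute not in _MINUTE_WORDS:
        return match.group(0)
    parts = [_HOUR_WORDS[hour]]
    if _MINUTE_WORDS[minute]:
        parts.append(_MINUTE_WORDS[minute])
    parts.extend([meridiem, "m"])
    return " ".join(parts)


def normalize_for_tts(text):
    """Apply the single time rule to every written time in the text."""
    return _TIME_PATTERN.sub(_verbalize_time, text)


print(normalize_for_tts("Book me a table at 5:00PM"))
```

Even this single rule needs a pattern, two lookup tables, and a fallback for inputs it does not recognize, which hints at why a rule set covering every kind of expression can grow into the thousands.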