Natural Language Misunderstanding

Communications of the ACM 

It is nearly impossible today to avoid voice-controlled digital assistants. In the interactive intelligent agents used by corporations and government agencies, and even in personal devices, automated speech recognition (ASR) systems combined with machine learning (ML) technology are increasingly used as an input modality that lets humans interact with machines, ostensibly in the most common and simplest way possible: by speaking in a natural, conversational voice. Yet as a study published in May 2020 by researchers from Stanford University indicated, the accuracy of ASR systems from major technology vendors varies widely depending on the speaker's race. While the study focused only on the differing accuracy levels for a small sample of African American and white speakers, it points to a larger concern about ASR accuracy and phonological awareness, including the ability to discern and understand accents, tonalities, rhythmic variations, and speech patterns that may differ from the voices used to initially train voice-activated chatbots, virtual assistants, and other voice-enabled systems.

The Stanford study, published in the journal Proceedings of the National Academy of Sciences, measured the error rates of ASR technology from Amazon, Apple, Google, IBM, and Microsoft. It compared each system's performance in understanding identical phrases (taken from pre-recorded interviews across two datasets) spoken by 73 black and 42 white speakers, then compared the average word error rate (WER) for the two groups.
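For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition with a word-level edit distance. It is not the study's code: the function names, the lowercase-and-split normalization, and the per-snippet averaging are assumptions made for illustration, and the example phrases are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: transcripts are normalized only by lowercasing and
    splitting on whitespace; real evaluations usually normalize more.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def average_wer(pairs):
    """Mean WER over (reference, hypothesis) pairs for one speaker group.

    Assumption: a simple unweighted mean per snippet.
    """
    rates = [word_error_rate(ref, hyp) for ref, hyp in pairs]
    return sum(rates) / len(rates)


# Invented example: one reference word ("the") is dropped from six.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Computing `average_wer` separately for each group of speakers and comparing the two numbers mirrors, in simplified form, the kind of comparison described above.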
