Careless Whisper: Speech-to-Text Hallucination Harms
Koenecke, Allison, Choi, Anna Seo Gyeong, Mei, Katelyn, Schellmann, Hilke, Sloane, Mona
arXiv.org Artificial Intelligence
Use of speech-to-text APIs is increasingly prevalent in high-stakes downstream applications, ranging from surveillance of incarcerated people [22] to medical care [14]. While these APIs can generate written transcriptions more quickly than human transcribers, there are grave concerns regarding bias in automated transcription accuracy, e.g., underperformance for African American English speakers [11] and speakers with speech impairments such as dysphonia [12]. These biases can perpetuate disparities when real-world decisions are made based on automated speech-to-text transcriptions, from police making carceral judgements to doctors making treatment decisions. OpenAI released its Whisper speech-to-text API in September 2022 with experiments showing better speech transcription accuracy relative to market competitors [19]. We evaluate Whisper's transcription performance on the axis of "hallucinations," defined as undesirable generated text "that is nonsensical, or unfaithful to the provided source input" [10]. Our approach compares the ground truth of a speech snippet with the output transcription; we find hallucinations in roughly 1% of transcriptions generated in mid-2023, wherein Whisper hallucinates entire made-up sentences when no one is speaking in the input audio files. While hallucinations have been increasingly studied in the context of text generated by ChatGPT (a language model also made by OpenAI) [8, 10], hallucinations in speech-to-text models have only been considered as a means to study error prediction [21], and not as a fundamental concern in their own right. In this paper, we provide experimental quantification of Whisper hallucinations, finding that nearly 40% of the hallucinations are harmful or concerning in some way (as opposed to innocuous and random).
Feb-12-2024
- Country:
  - Europe
    - Czechia > South Moravian Region
      - Brno (0.04)
    - United Kingdom > England
      - Oxfordshire > Oxford (0.04)
  - North America > United States
    - Illinois > Cook County
      - Chicago (0.04)
    - New York (0.04)
    - Virginia (0.04)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Industry:
  - Health & Medicine > Therapeutic Area (1.00)
  - Information Technology (1.00)
- Technology: