Personalization for BERT-based Discriminative Speech Recognition Rescoring

Jari Kolehmainen, Yile Gu, Aditya Gourav, Prashanth Gurunath Shivakumar, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

arXiv.org Artificial Intelligence 

Recognition of personalized content remains a challenge in end-to-end speech recognition. We explore three novel approaches that use personalized content in a neural rescoring step to improve recognition: gazetteers, prompting, and a cross-attention based encoder-decoder model. We compare these approaches using internal de-identified en-US data from interactions with a virtual voice assistant, supplemented with personalized named entities. On a test set with personalized named entities, we show that each of these approaches improves word error rate (WER) by over 10% relative to a neural rescoring baseline. We also show that on this test set, natural language prompts can improve WER by 7% without any training and with only a marginal loss in generalization. Overall, gazetteers were found to perform best, with a 10% improvement in WER, while also improving WER on a general test set by 1%.
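
To make the prompting approach concrete, the sketch below rescores ASR n-best hypotheses with a BERT masked language model, prepending a natural-language prompt that lists a user's personalized entities. The model checkpoint, prompt wording, interpolation weight `lm_weight`, and the pseudo-log-likelihood scoring are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: masked-LM rescoring of ASR n-best hypotheses, optionally
# conditioned on a natural-language prompt carrying personalized entities.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_log_likelihood(text: str) -> float:
    """Sum of log P(token | rest) with each token masked in turn."""
    ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    total = 0.0
    with torch.no_grad():
        for i in range(1, ids.size(0) - 1):        # skip [CLS] and [SEP]
            masked = ids.clone()
            masked[i] = tokenizer.mask_token_id
            logits = model(masked.unsqueeze(0)).logits[0, i]
            total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

def rescore(nbest, prompt="", lm_weight=0.5):
    """nbest: list of (hypothesis_text, first_pass_score); returns best text."""
    def combined(hyp, first_pass_score):
        text = f"{prompt} {hyp}".strip() if prompt else hyp
        return first_pass_score + lm_weight * pseudo_log_likelihood(text)
    return max(nbest, key=lambda h: combined(*h))[0]

# Example with a hypothetical prompt of personalized contact names.
prompt = "my contacts are kaisa and jari ."
print(rescore([("call kaisa", -1.2), ("call visa", -1.0)], prompt=prompt))
```

Because the prompt only changes the text fed to the masked LM at inference time, this style of personalization requires no additional training, which matches the training-free behavior the abstract attributes to prompting.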
