N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space

Rao Ma, Mark J. F. Gales, Kate M. Knill, Mengjie Qian

Oct-10-2023–arXiv.org Artificial Intelligence

Error correction models form an important part of Automatic Speech Recognition (ASR) post-processing, improving the readability and quality of transcriptions. Most prior work uses the 1-best ASR hypothesis as input and can therefore only perform correction by leveraging the context within one sentence. In this work, we propose a novel N-best T5 model for this task, which is fine-tuned from a T5 model and takes ASR N-best lists as model input. By transferring knowledge from the pre-trained language model and obtaining richer information from the ASR decoding space, the proposed approach outperforms a strong Conformer-Transducer baseline. Another issue with standard error correction is that the generation process is not well guided. To address this, a constrained decoding process, based on either the N-best list or an ASR lattice, is used, which allows additional information to be propagated.
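The two ideas in the abstract, feeding an N-best list to the model and constraining the output to the ASR decoding space, can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration and not taken from the paper: the `<sep>` separator, the helper names `build_nbest_input` and `constrained_select`, and the `toy_score` function, which merely stands in for a score that a fine-tuned correction model might assign to each hypothesis.

```python
from typing import Callable, Sequence


def build_nbest_input(hypotheses: Sequence[str], sep: str = " <sep> ") -> str:
    """Join N-best hypotheses (best first) into a single input string.

    The separator is a hypothetical placeholder, not the paper's format.
    """
    return sep.join(h.strip() for h in hypotheses)


def constrained_select(
    hypotheses: Sequence[str],
    score: Callable[[str], float],
) -> str:
    """Pick the highest-scoring hypothesis from the N-best list.

    Because the candidates are restricted to the N-best list, the result
    is always one of the ASR hypotheses. Ties keep the earlier entry.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    return max(hypotheses, key=score)


def toy_score(text: str) -> float:
    """Toy stand-in for a model score: count words found in a tiny vocabulary."""
    vocab = {"the", "cat", "sat", "on", "mat"}
    return float(sum(word in vocab for word in text.split()))


nbest = [
    "the cat sad on the mat",
    "the cat sat on the mat",
    "the cat sat on a mat",
]
model_input = build_nbest_input(nbest)
best = constrained_select(nbest, toy_score)
print(model_input)
print(best)
```

In this sketch the selection can never produce text outside the N-best list, which is the sense in which the decoding space is constrained; a lattice-based variant would allow a larger set of paths and is not shown here.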

data quality, machine learning, natural language

Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language (1.00)
    - Representation & Reasoning (0.96)
    - Speech > Speech Recognition (1.00)
  - Data Science > Data Quality
    - Data Cleaning (0.88)
