AITopics | error correction model

Collaborating Authors

error correction model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition

Shu, Yuchun, Hu, Bo, He, Yifeng, Shi, Hao, Wang, Longbiao, Dang, Jianwu

arXiv.org Artificial IntelligenceJun-29-2024

Accurately finding the wrong words in the automatic speech recognition (ASR) hypothesis and recovering them well-founded is the goal of speech error correction. In this paper, we propose a non-autoregressive speech error correction method. A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses as the reference to find the wrong word position. Besides, the acoustic feature from the ASR encoder is also used to provide the correct pronunciation references. N-best candidates from ASR are aligned using the edit path, to confirm each other and recover some missing character errors. Furthermore, the cross-attention mechanism fuses the information between error correction references and the ASR hypothesis. The experimental results show that both the acoustic and confidence references help with error correction. The proposed system reduces the error rate by 21% compared with the ASR model.

correction, error correction, information, (13 more...)

arXiv.org Artificial Intelligence

2407.12817

Country:

Asia > China > Tianjin Province > Tianjin (0.05)
Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Tag and correct: high precision post-editing approach to correction of speech recognition errors

Ziętkiewicz, Tomasz

arXiv.org Artificial IntelligenceJun-11-2024

This paper presents a new approach to the problem of correcting speech recognition errors by means of post-editing. It consists of using a neural sequence tagger that learns how to correct an ASR (Automatic Speech Recognition) hypothesis word by word and a corrector module that applies corrections returned by the tagger. The proposed solution is applicable to any ASR system, regardless of its architecture, and provides high-precision control over errors being corrected. This is especially crucial in production environments, where avoiding the introduction of new mistakes by the error correction model may be more important than the net gain in overall results. The results show that the performance of the proposed error correction models is comparable with previous approaches while requiring much smaller resources to train, which makes it suitable for industrial applications, where both inference latency and training times are critical factors that limit the use of other techniques.

computational linguistic, correction, edit operation, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.15439/2022F168

2406.07589

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Poland > Greater Poland Province > Poznań (0.05)
(5 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition

Gu, Zijin, Likhomanenko, Tatiana, Bai, He, McDermott, Erik, Collobert, Ronan, Jaitly, Navdeep

arXiv.org Artificial IntelligenceMay-24-2024

Language models (LMs) have long been used to improve results of automatic speech recognition (ASR) systems, but they are unaware of the errors that ASR systems make. Error correction models are designed to fix ASR errors, however, they showed little improvement over traditional LMs mainly due to the lack of supervised training data. In this paper, we present Denoising LM (DLM), which is a $\textit{scaled}$ error correction model trained with vast amounts of synthetic data, significantly exceeding prior attempts meanwhile achieving new state-of-the-art ASR performance. We use text-to-speech (TTS) systems to synthesize audio, which is fed into an ASR system to produce noisy hypotheses, which are then paired with the original texts to train the DLM. DLM has several $\textit{key ingredients}$: (i) up-scaled model and data; (ii) usage of multi-speaker TTS systems; (iii) combination of multiple noise augmentation strategies; and (iv) new decoding techniques. With a Transformer-CTC ASR, DLM achieves 1.5% word error rate (WER) on $\textit{test-clean}$ and 3.3% WER on $\textit{test-other}$ on Librispeech, which to our knowledge are the best reported numbers in the setting where no external audio data are used and even match self-supervised methods which use external audio data. Furthermore, a single DLM is applicable to different ASRs, and greatly surpassing the performance of conventional LM based beam-search rescoring. These results indicate that properly investigated error correction models have the potential to replace conventional LMs, holding the key to a new level of accuracy in ASR systems.

baseline asr, dlm, speech recognition, (13 more...)

arXiv.org Artificial Intelligence

2405.15216

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adapting an Unadaptable ASR System

Ma, Rao, Qian, Mengjie, Gales, Mark J. F., Knill, Kate M.

arXiv.org Artificial IntelligenceOct-10-2023

As speech recognition model sizes and training data requirements grow, it is increasingly common for systems to only be available via APIs from online service providers rather than having direct access to models themselves. In this scenario it is challenging to adapt systems to a specific target domain. To address this problem we consider the recently released OpenAI Whisper ASR as an example of a large-scale ASR system to assess adaptation methods. An error correction based approach is adopted, as this does not require access to the model, but can be trained from either 1-best or N-best outputs that are normally available via the ASR API. LibriSpeech is used as the primary target domain for adaptation. The generalization ability of the system in two distinct dimensions are then evaluated. First, whether the form of correction model is portable to other speech recognition domains, and secondly whether it can be used for ASR models having a different architecture.

correction model, error correction model, hypothesis, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2023-1899

2306.01208

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)

Add feedback

N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space

Ma, Rao, Gales, Mark J. F., Knill, Kate M., Qian, Mengjie

arXiv.org Artificial IntelligenceOct-10-2023

Error correction models form an important part of Automatic Speech Recognition (ASR) post-processing to improve the readability and quality of transcriptions. Most prior works use the 1-best ASR hypothesis as input and therefore can only perform correction by leveraging the context within one sentence. In this work, we propose a novel N-best T5 model for this task, which is fine-tuned from a T5 model and utilizes ASR N-best lists as model input. By transferring knowledge from the pre-trained language model and obtaining richer information from the ASR decoding space, the proposed approach outperforms a strong Conformer-Transducer baseline. Another issue with standard error correction is that the generation process is not well-guided. To address this a constrained decoding process, either based on the N-best list or an ASR lattice, is used which allows additional information to be propagated.

error correction model, n-best list, speech recognition, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2023-1616

2303.00456

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Can Generative Large Language Models Perform ASR Error Correction?

Ma, Rao, Qian, Mengjie, Manakul, Potsawee, Gales, Mark, Knill, Kate

arXiv.org Artificial IntelligenceSep-29-2023

ASR error correction is an interesting option for post processing speech recognition system outputs. These error correction models are usually trained in a supervised fashion using the decoding results of a target ASR system. This approach can be computationally intensive and the model is tuned to a specific ASR system. Recently generative large language models (LLMs) have been applied to a wide range of natural language processing tasks, as they can operate in a zero-shot or few shot fashion. In this paper we investigate using ChatGPT, a generative LLM, for ASR error correction. Based on the ASR N-best output, we propose both unconstrained and constrained, where a member of the N-best list is selected, approaches. Additionally, zero and 1-shot settings are evaluated. Experiments show that this generative LLM approach can yield performance gains for two different state-of-the-art ASR architectures, transducer and attention-encoder-decoder based, and multiple test sets.

chatgpt, error correction, hypothesis, (13 more...)

arXiv.org Artificial Intelligence

2307.04172

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Leveraging Denoised Abstract Meaning Representation for Grammatical Error Correction

Cao, Hejing, Zhao, Dongyan

arXiv.org Artificial IntelligenceJul-5-2023

Grammatical Error Correction (GEC) is the task of correcting errorful sentences into grammatically correct, semantically consistent, and coherent sentences. Popular GEC models either use large-scale synthetic corpora or use a large number of human-designed rules. The former is costly to train, while the latter requires quite a lot of human expertise. In recent years, AMR, a semantic representation framework, has been widely used by many natural language tasks due to its completeness and flexibility. A non-negligible concern is that AMRs of grammatically incorrect sentences may not be exactly reliable. In this paper, we propose the AMR-GEC, a seq-to-seq model that incorporates denoised AMR as additional knowledge. Specifically, We design a semantic aggregated GEC model and explore denoising methods to get AMRs more reliable. Experiments on the BEA-2019 shared task and the CoNLL-2014 shared task have shown that AMR-GEC performs comparably to a set of strong baselines with a large number of synthetic data. Compared with the T5 model with synthetic data, AMR-GEC can reduce the training time by 32\% while inference time is comparable. To the best of our knowledge, we are the first to incorporate AMR for grammatical error correction.

computational linguistic, data quality, natural language, (17 more...)

arXiv.org Artificial Intelligence

2307.02127

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Data Science > Data Quality > Data Cleaning (0.86)

Add feedback

Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing

Yan, Hao, Srivastava, Saurabh, Tai, Yintao, Wang, Sida I., Yih, Wen-tau, Yao, Ziyu

arXiv.org Artificial IntelligenceJun-4-2023

Interactive semantic parsing based on natural language (NL) feedback, where users provide feedback to correct the parser mistakes, has emerged as a more practical scenario than the traditional one-shot semantic parsing. However, prior work has heavily relied on human-annotated feedback data to train the interactive semantic parser, which is prohibitively expensive and not scalable. In this work, we propose a new task of simulating NL feedback for interactive semantic parsing. We accompany the task with a novel feedback evaluator. The evaluator is specifically designed to assess the quality of the simulated feedback, based on which we decide the best feedback simulator from our proposed variants. On a text-to-SQL dataset, we show that our feedback simulator can generate high-quality NL feedback to boost the error correction ability of a specific parser. In low-data settings, our feedback simulator can help achieve comparable error correction performance as trained using the costly, full set of human annotations.

artificial intelligence, computational linguistic, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.08195

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > Dominican Republic (0.04)
Asia > Middle East > Iran (0.04)
(10 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Text-to-SQL Error Correction with Language Models of Code

Chen, Ziru, Chen, Shijie, White, Michael, Mooney, Raymond, Payani, Ali, Srinivasa, Jayanth, Su, Yu, Sun, Huan

arXiv.org Artificial IntelligenceMay-28-2023

Despite recent progress in text-to-SQL parsing, current semantic parsers are still not accurate enough for practical use. In this paper, we investigate how to build automatic text-to-SQL error correction models. Noticing that token-level edits are out of context and sometimes ambiguous, we propose building clause-level edit models instead. Besides, while most language models of code are not specifically pre-trained for SQL, they know common data structures and their operations in programming languages such as Python. Thus, we propose a novel representation for SQL queries and their edits that adheres more closely to the pre-training corpora of language models of code. Our error correction model improves the exact set match accuracy of different parsers by 2.4-6.5 and obtains up to 4.3 point absolute improvement over two strong baselines. Our code and data are available at https://github.com/OSU-NLP-Group/Auto-SQL-Correction.

artificial intelligence, computational linguistic, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.13073

Country:

North America > United States > Ohio (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Chinese grammatical error correction based on knowledge distillation

Xia, Peng, Zhou, Yuechi, Zhang, Ziyan, Tang, Zecheng, Li, Juntao

arXiv.org Artificial IntelligenceAug-31-2022

In view of the poor robustness of existing Chinese grammatical error correction models on attack test sets and large model parameters, this paper uses the method of knowledge distillation to compress model parameters and improve the anti-attack ability of the model. In terms of data, the attack test set is constructed by integrating the disturbance into the standard evaluation data set, and the model robustness is evaluated by the attack test set. The experimental results show that the distilled small model can ensure the performance and improve the training speed under the condition of reducing the number of model parameters, and achieve the optimal effect on the attack test set, and the robustness is significantly improved. Code is available at https://github.com/Richard88888/KD-CGEC.

correction, error correction, grammatical error correction, (14 more...)

arXiv.org Artificial Intelligence

2208.00351

Country:

Asia > China > Jiangsu Province (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Illinois (0.04)
Europe > Spain (0.04)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Data Science > Data Quality > Data Cleaning (0.68)

Add feedback