Listening, Imagining & Refining: A Heuristic Optimized ASR Correction Framework with LLMs

Liu, Yutong, Zhang, Ziyue, Huang, Cheng, Yu, Yongbin, Wang, Xiangxiang, Cai, Yuqing, Tashi, Nyima

Sep-23-2025–arXiv.org Artificial Intelligence

ABSTRACT Automatic Speech Recognition (ASR) systems remain prone to errors that affect downstream applications. In this paper, we propose LIR-ASR, a heuristic-optimized iterative correction framework using LLMs, inspired by human auditory perception. LIR-ASR applies a "Listening-Imagining-Refining" strategy, generating phonetic variants and refining them in context. A heuristic optimization with a finite state machine (FSM) is introduced to keep the correction process from being trapped in local optima, and rule-based constraints help maintain semantic fidelity. Experiments on both English and Chinese ASR outputs show that LIR-ASR achieves average reductions in CER/WER of up to 1.5 percentage points compared to baselines, demonstrating substantial accuracy gains in transcription.
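The abstract reports results as CER/WER reductions in percentage points. As a reminder of what those metrics measure, here is a minimal sketch using the standard definitions (edit distance divided by reference length). It is not taken from the paper: the function names `edit_distance`, `wer`, and `cer` are illustrative, and the whitespace handling in `cer` is an assumption, since the paper's exact text normalization is not given here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (tokens or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; spaces are dropped here (an assumption)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

Under these definitions, a drop from a WER of 0.100 to 0.085 is a reduction of 1.5 percentage points, which is the kind of difference the abstract refers to.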



