H-PRM: A Pluggable Hotword Pre-Retrieval Module for Various Speech Recognition Systems
Dai, Huangyu; Mao, Lingtao; Chen, Ben; Wang, Zihan; Liang, Zihan; Han, Ying; Lei, Chenyi; Li, Han
arXiv.org Artificial Intelligence
Hotword customization is crucial in automatic speech recognition (ASR) for improving the accuracy of domain-specific terms. Progress has been driven primarily by advances in traditional models and audio large language models (Audio LLMs). However, existing models often struggle with large-scale hotword lists: the recognition rate drops dramatically as the number of hotwords increases. In this paper, we introduce a novel hotword customization system that uses a hotword pre-retrieval module (H-PRM) to identify the most relevant hotword candidate by measuring the acoustic similarity between the hotwords and the speech segment. This plug-and-play solution can be easily integrated into traditional models such as SeACo-Paraformer, significantly improving the hotword post-recall rate (PRR). Additionally, we incorporate H-PRM into Audio LLMs through a prompt-based approach, enabling seamless hotword customization. Extensive testing shows that H-PRM can outperform existing methods, pointing to a new direction for hotword customization in ASR.
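The pre-retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that hotwords and the speech segment have already been mapped to fixed-length embedding vectors, it uses cosine similarity as a stand-in for the paper's acoustic similarity measure, and the names `retrieve_hotwords`, `build_prompt`, `top_k`, and the prompt wording are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_hotwords(
    segment_embedding: Sequence[float],
    hotword_embeddings: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Hypothetical pre-retrieval step: rank hotwords by similarity to the
    speech segment and keep only the top_k candidates."""
    scored = [
        (word, cosine_similarity(segment_embedding, emb))
        for word, emb in hotword_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def build_prompt(candidates: List[Tuple[str, float]]) -> str:
    """Hypothetical prompt-based integration: pass the retrieved hotwords
    to an audio LLM as part of its text prompt."""
    words = ", ".join(word for word, _ in candidates)
    return f"Transcribe the audio. Possible hotwords: {words}."


# Toy embeddings (assumed, for illustration only).
hotword_embeddings = {
    "Paraformer": [0.9, 0.1, 0.0],
    "Hangzhou": [0.0, 1.0, 0.0],
    "Seoul": [0.0, 0.0, 1.0],
}
segment_embedding = [1.0, 0.2, 0.0]

candidates = retrieve_hotwords(segment_embedding, hotword_embeddings, top_k=2)
prompt = build_prompt(candidates)
print(prompt)
```

Because only the retrieved candidates reach the recognizer, the size of the list the downstream model sees stays fixed at `top_k` regardless of how large the full hotword list grows, which is the property the abstract attributes to pre-retrieval.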
Nov-25-2025
- Country:
  - Asia
    - China
      - Beijing > Beijing (0.04)
      - Zhejiang Province > Hangzhou (0.06)
    - South Korea > Seoul
      - Seoul (0.05)
  - North America > United States
    - New York > New York County > New York City (0.04)
- Genre:
  - Research Report (0.82)
- Technology: