No Audiogram: Leveraging Existing Scores for Personalized Speech Intelligibility Prediction
Zhou, Haoshuai, Mo, Changgeng, Cao, Boxuan, Li, Linkai, Wang, Shan Xiang
arXiv.org Artificial Intelligence
Personalized speech intelligibility prediction is challenging. Previous approaches have relied mainly on audiograms, which are inherently limited in accuracy because they capture only a listener's hearing thresholds for pure tones. Rather than incorporating additional listener features, we propose a novel approach that uses an individual's existing intelligibility data to predict their performance on new audio. We introduce the Support Sample-Based Intelligibility Prediction Network (SSIPNet), a deep learning model that leverages speech foundation models to build a high-dimensional representation of a listener's speech recognition ability from multiple support (audio, score) pairs, enabling accurate predictions for unseen audio. Results on the Clarity Prediction Challenge dataset show that, even with a small number of support (audio, score) pairs, our method outperforms audiogram-based predictions. Our work presents a new paradigm for personalized speech intelligibility prediction.
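To make the support-sample idea concrete, here is a minimal sketch of predicting a listener's score on new audio from their existing (audio, score) pairs. It is not SSIPNet, whose architecture the abstract does not detail. It substitutes a simple similarity-weighted average (kernel regression) for the learned model. The embeddings are hypothetical placeholders for features that a speech foundation model might produce, and the names `cosine_similarity`, `predict_from_support`, and the `temperature` parameter are assumptions introduced for illustration.

```python
import math
from typing import List, Sequence, Tuple

Embedding = Sequence[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_from_support(
    query: Embedding,
    support: List[Tuple[Embedding, float]],
    temperature: float = 0.1,  # hypothetical sharpness parameter
) -> float:
    """Estimate a listener's score on `query` from their support pairs.

    Each support pair is (audio embedding, observed intelligibility score).
    The prediction is a softmax-weighted average of the support scores,
    where weights grow with similarity between query and support audio.
    """
    if not support:
        raise ValueError("at least one support pair is required")
    logits = [cosine_similarity(query, emb) / temperature for emb, _ in support]
    max_logit = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    return sum(w * score for w, (_, score) in zip(weights, support)) / total


# Toy example with made-up 2-D embeddings and scores in [0, 1].
support_pairs = [([1.0, 0.0], 0.9), ([0.0, 1.0], 0.2)]
near_first = predict_from_support([1.0, 0.0], support_pairs)
equidistant = predict_from_support([1.0, 1.0], support_pairs)
```

In this toy setup, a query that matches the first support sample yields a prediction close to that sample's score, while a query equally similar to both yields their mean. A learned model would replace the fixed similarity weighting with a trained representation of the listener.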
Jun-4-2025