No Audiogram: Leveraging Existing Scores for Personalized Speech Intelligibility Prediction

Zhou, Haoshuai, Mo, Changgeng, Cao, Boxuan, Li, Linkai, Wang, Shan Xiang

Jun-4-2025–arXiv.org Artificial Intelligence

Personalized speech intelligibility prediction is challenging. Previous approaches have mainly relied on audiograms, which are inherently limited in accuracy because they capture only a listener's hearing threshold for pure tones. Rather than incorporating additional listener features, we propose a novel approach that leverages an individual's existing intelligibility data to predict their performance on new audio. We introduce the Support Sample-Based Intelligibility Prediction Network (SSIPNet), a deep learning model that uses speech foundation models to build a high-dimensional representation of a listener's speech recognition ability from multiple support (audio, score) pairs, enabling accurate predictions for unseen audio. Results on the Clarity Prediction Challenge dataset show that, even with a small number of support (audio, score) pairs, our method outperforms audiogram-based predictions. Our work presents a new paradigm for personalized speech intelligibility prediction.
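The general idea of predicting from support (audio, score) pairs can be illustrated with a toy sketch. This is not SSIPNet: the abstract does not describe the architecture, so the code below swaps in a simple similarity-weighted average of a listener's known scores. The function names (`cosine`, `predict_score`), the `temperature` parameter, and the hand-written embedding vectors are all hypothetical; in practice the embeddings would presumably come from a speech foundation model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_score(support, query_embedding, temperature=0.1):
    """Predict a listener's score on new audio from their support pairs.

    support: list of (embedding, score) pairs for ONE listener, where each
        embedding is a stand-in for foundation-model features of an audio clip
        and score is that listener's measured intelligibility on it.
    query_embedding: embedding of the unseen audio clip.
    temperature: hypothetical sharpness parameter for the softmax weights.

    Returns a softmax-weighted average of the support scores, weighting
    support clips that resemble the query more heavily.
    """
    if not support:
        raise ValueError("at least one support (embedding, score) pair is required")
    sims = [cosine(emb, query_embedding) for emb, _ in support]
    max_sim = max(sims)  # subtract the max for numerical stability
    weights = [math.exp((s - max_sim) / temperature) for s in sims]
    total = sum(weights)
    return sum(w * score for w, (_, score) in zip(weights, support)) / total


# Two support clips for one listener, with made-up 2-D embeddings and scores.
support_pairs = [([1.0, 0.0], 90.0), ([0.0, 1.0], 10.0)]
near_first = predict_score(support_pairs, [1.0, 0.0])
equidistant = predict_score(support_pairs, [1.0, 1.0])
```

Here a query that matches the first support clip yields a prediction close to that clip's score, while a query equally similar to both clips yields their mean. A learned model would replace both the fixed similarity measure and the averaging step.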

artificial intelligence, intelligibility score, machine learning

Country:
- Europe > France (0.04)
- Asia > China (0.04)
- North America > United States
  - Florida > Hillsborough County
    - University (0.04)
  - California
    - Santa Clara County > Palo Alto (0.04)
    - San Diego County > San Diego (0.04)

Genre:
- Research Report > Promising Solution (0.34)

Industry:
- Health & Medicine > Therapeutic Area (0.97)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Statistical Learning (0.68)
    - Neural Networks > Deep Learning (0.68)
