Pairwise Evaluation of Accent Similarity in Speech Synthesis

Jinzuomu Zhong, Suyuan Liu, Dan Wells, Korin Richmond

May-21-2025–arXiv.org Artificial Intelligence

Despite growing interest in generating high-fidelity accents, evaluating accent similarity in speech synthesis has been underexplored. We aim to enhance both subjective and objective evaluation methods for accent similarity. Subjectively, we refine the XAB listening test by adding components that achieve higher statistical significance with fewer listeners and lower costs. Our method involves providing listeners with transcriptions, having them highlight perceived accent differences, and implementing meticulous screening for reliability. Objectively, we utilise pronunciation-related metrics, based on distances between vowel formants and phonetic posteriorgrams, to evaluate accent generation. Comparative experiments reveal that these metrics, alongside accent similarity, speaker similarity, and Mel Cepstral Distortion, can be used to evaluate accent generation. Moreover, our findings underscore significant limitations of common metrics like Word Error Rate in assessing underrepresented accents.
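As a rough illustration of two quantities named in the abstract, the sketch below computes Mel Cepstral Distortion using its conventional definition and checks an XAB preference count against chance with an exact two-sided binomial test. The abstract does not give the authors' exact formulation or statistical procedure, so both choices are assumptions, and the function names `mel_cepstral_distortion` and `xab_binomial_p_value` are hypothetical. Frames are assumed to be time-aligned already, and the 0th (energy) coefficient is assumed to be excluded.

```python
import math
from typing import Sequence


def mel_cepstral_distortion(
    ref: Sequence[Sequence[float]], syn: Sequence[Sequence[float]]
) -> float:
    """Mean MCD in dB over frames (conventional definition, assumed here).

    Assumes both inputs are already time-aligned, frame for frame, and that
    index 0 of each frame is the energy coefficient, which is skipped.
    """
    if len(ref) != len(syn) or len(ref) == 0:
        raise ValueError("inputs must be non-empty and have equal frame counts")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for r, s in zip(ref, syn):
        squared = sum((a - b) ** 2 for a, b in zip(r[1:], s[1:]))
        total += scale * math.sqrt(squared)
    return total / len(ref)


def xab_binomial_p_value(wins: int, trials: int) -> float:
    """Exact two-sided binomial p-value against a chance rate of 0.5.

    `wins` is the number of XAB trials in which listeners picked system A
    as closer to the reference X; this test is an assumption, not
    necessarily the analysis used in the paper.
    """
    if trials <= 0 or not 0 <= wins <= trials:
        raise ValueError("need 0 <= wins <= trials and trials > 0")
    k = max(wins, trials - wins)
    tail = sum(math.comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    # The null distribution is symmetric at p = 0.5, so double one tail.
    return min(1.0, 2.0 * tail)
```

A lower MCD indicates spectrally closer speech, and a small p-value indicates that the listeners' preference is unlikely under chance; neither function by itself measures accent similarity specifically.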

artificial intelligence, machine learning, similarity

Country:
- Europe > United Kingdom > England (0.28)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Synthesis (0.88)
  - Machine Learning
    - Neural Networks > Deep Learning (0.46)
    - Performance Analysis > Accuracy (0.34)
