Evaluating Identity Leakage in Speaker De-Identification Systems
Seungmin Seo, Oleg Aulov, Afzal Godil, Kevin Mangold
arXiv.org Artificial Intelligence
Speaker de-identification aims to conceal a speaker's identity while preserving the intelligibility of the underlying speech. We introduce a benchmark that quantifies residual identity leakage with three complementary metrics: equal error rate (EER), cumulative match characteristic (CMC) hit rate, and embedding-space similarity measured via canonical correlation analysis and Procrustes analysis. Evaluation results reveal that all state-of-the-art speaker de-identification systems leak identity information. Against the highest-performing system in our evaluation, re-identification is only slightly better than random guessing, while the lowest-performing system allows a 45% CMC hit rate within the top 50 candidates. These findings highlight persistent privacy risks in current speaker de-identification technologies.
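To make two of the metrics named in the abstract concrete, the following is a minimal sketch of how an equal error rate and a top-k CMC hit rate could be computed from similarity scores. It is not the authors' implementation: the function names `equal_error_rate` and `cmc_hit_rate`, the input layout, and the convention that a higher score means "more likely the same speaker" are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Approximate EER: the point where false-accept and false-reject rates meet.

    Assumes higher scores mean "more likely the same speaker" (hypothetical convention).
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Use every observed score as a candidate decision threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # False-accept rate: impostor trials scoring at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # False-reject rate: genuine trials scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))
    return float((far[i] + frr[i]) / 2)


def cmc_hit_rate(similarity, true_index, k):
    """Fraction of probes whose true speaker ranks among the top-k gallery candidates.

    similarity: (n_probes, n_gallery) score matrix, higher = more similar (assumed).
    true_index: gallery column holding each probe's true speaker.
    """
    similarity = np.asarray(similarity, dtype=float)
    true_index = np.asarray(true_index)
    true_scores = similarity[np.arange(len(true_index)), true_index]
    # Zero-based rank: number of gallery entries scoring strictly higher than the true one.
    ranks = (similarity > true_scores[:, None]).sum(axis=1)
    return float((ranks < k).mean())
```

Under these conventions, an EER near 0.5 means de-identified speech can be matched to its speaker no better than chance, which is the ideal outcome for a de-identification system, while a high top-k hit rate indicates that the true speaker is frequently recoverable from a short candidate list.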
Aug-20-2025
- Country:
- North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
- Genre:
- Research Report (1.00)