Impact of Phonetics on Speaker Identity in Adversarial Voice Attack

Dar, Daniyal Kabir, Yan, Qiben, Xiao, Li, Ross, Arun

Sep-22-2025–arXiv.org Artificial Intelligence

Abstract--Adversarial perturbations in speech pose a serious threat to automatic speech recognition (ASR) and speaker verification by introducing subtle waveform modifications that remain imperceptible to humans but can significantly alter system outputs. While targeted attacks on end-to-end ASR models have been widely studied, the phonetic basis of these perturbations and their effect on speaker identity remain underexplored. In this work, we analyze adversarial audio at the phonetic level and show that perturbations are associated with systematic phonetic tendencies, such as vowel centralization and consonant substitutions. Using the DeepSpeech ASR model as our target, we generate targeted adversarial examples and evaluate their impact on speaker identity embeddings across genuine and impostor samples. Results across 16 phonetically diverse target phrases demonstrate that adversarial audio induces both transcription errors and identity drift, highlighting the need for phonetic-aware defenses to ensure the robustness of ASR and speaker recognition systems.

artificial intelligence, identity drift, machine learning, (15 more...)

arXiv.org Artificial Intelligence

Sep-22-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.46)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Machine Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found