VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model

He, Yayun, Kang, Zuheng, Wang, Jianzong, Peng, Junqing, Xiao, Jing

Oct-6-2023–arXiv.org Artificial Intelligence

Speaker verification (SV) performance deteriorates as utterances become shorter. To this end, we propose a new architecture called VoiceExtender which provides a promising solution for improving SV performance when handling short-duration speech signals. We use two guided diffusion models, the built-in and the external speaker embedding (SE) guided diffusion model, both of which utilize a diffusion model-based sample generator that leverages SE guidance to augment the speech features based on a short utterance. Extensive experimental results on the VoxCeleb1 dataset show that our method outperforms the baseline, with relative improvements in equal error rate (EER) of 46.1%, 35.7%, 10.4%, and 5.7% for the short utterance conditions of 0.5, 1.0, 1.5, and 2.0 seconds, respectively.

diffusion model, utterance, voiceextender, (15 more...)

arXiv.org Artificial Intelligence

Oct-6-2023

arXiv.org PDF

Add feedback

Country:
- Europe > Italy
  - Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > China
  - Guangdong Province > Shenzhen (0.04)

Genre:
- Research Report
  - New Finding (0.46)
  - Promising Solution (0.34)

Industry:
- Information Technology > Security & Privacy (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (0.69)
  - Speech
    - Speech Recognition (0.87)
    - Acoustic Processing (0.87)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found