NTU-NPU System for Voice Privacy 2024 Challenge
Nikita Kuzmin, Hieu-Thi Luong, Jixun Yao, Lei Xie, Kong Aik Lee, Eng Siong Chng
arXiv.org Artificial Intelligence
In this work, we describe our submissions for the Voice Privacy Challenge 2024. Rather than proposing a novel speech anonymization system, we enhance the provided baselines to meet all required conditions and improve evaluated metrics. Specifically, we implement emotion embedding and experiment with WavLM and ECAPA2 speaker embedders for the B3 baseline.

B3

The baseline system B3 uses a Wasserstein generative adversarial network with Quadratic Transport Cost (WGAN-QC) [6] to generate artificial pseudo-speaker embeddings, anonymizing the speaker's identity through four main steps:

1. Phonetic Transcriptions Extraction: Phonetic transcriptions
Oct-3-2024