Multilingual Dataset Integration Strategies for Robust Audio Deepfake Detection: A SAFE Challenge System

Hashim Ali, Surya Subramani, Lekha Bollinani, Nithin Sai Adupa, Sali El-Loh, Hafiz Malik

Oct-8-2025–arXiv.org Artificial Intelligence

The SAFE Challenge evaluates synthetic speech detection across three tasks: unmodified audio, processed audio with compression artifacts, and laundered audio designed to evade detection. We systematically explore self-supervised learning (SSL) front-ends, training data compositions, and audio length configurations for robust deepfake detection. Our AASIST-based approach combines a WavLM Large front-end with RawBoost augmentation and is trained on a multilingual dataset of 256,600 samples spanning 9 languages and over 70 TTS systems, drawn from CodecFake, MLAAD v5, SpoofCeleb, Famous Figures, and MAILABS. After extensive experimentation with different SSL front-ends, three versions of the training data, and two audio lengths, we achieved second place in both Task 1 (unmodified audio detection) and Task 3 (laundered audio detection), demonstrating strong generalization and robustness.
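The abstract mentions comparing two audio length configurations but does not say how clips are brought to a fixed length. As a minimal sketch only, the following shows one common convention in raw-waveform detectors: crop clips that are too long and tile (repeat) clips that are too short. The function name `fix_length`, the 16 kHz sample rate, and the 4-second target are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000  # assumed sample rate; not stated in the abstract


def fix_length(samples: Sequence[float], target_len: int) -> List[float]:
    """Return exactly target_len samples.

    Long clips are cropped from the start; short clips are tiled
    (repeated end to end) and then cropped. This is a hypothetical
    helper, not the authors' preprocessing code.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if len(samples) == 0:
        raise ValueError("cannot pad an empty clip")
    if len(samples) >= target_len:
        return list(samples[:target_len])
    repeats = -(-target_len // len(samples))  # ceiling division
    return (list(samples) * repeats)[:target_len]


# A 3-sample clip tiled up to 7 samples, and a 10-sample clip cropped to 4.
padded = fix_length([0.1, 0.2, 0.3], 7)
cropped = fix_length(list(range(10)), 4)

# A 1-second clip brought to an assumed 4-second input window.
four_seconds = fix_length([0.0] * SAMPLE_RATE, 4 * SAMPLE_RATE)
```

Tiling rather than zero-padding avoids feeding the model long stretches of digital silence, which is one reason this convention is often used; whether the described system does the same is not confirmed by the abstract.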

artificial intelligence, dataset, machine learning

Country:
- Asia > China
  - Beijing > Beijing (0.04)
- Europe
  - Italy > Calabria
    - Catanzaro Province > Catanzaro (0.04)
  - United Kingdom (0.14)
- North America > United States
  - Michigan > Wayne County > Dearborn (0.04)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)