KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation

Chung, Yoonjin, Eu, Pilsun, Lee, Junwon, Choi, Keunwoo, Nam, Juhan, Chon, Ben Sangbae

Mar-9-2025–arXiv.org Artificial Intelligence

Although widely adopted for evaluating generated audio signals, the Fréchet Audio Distance (FAD) suffers from significant limitations, including reliance on Gaussian assumptions, sensitivity to sample size, and high computational complexity. As an alternative, we introduce the Kernel Audio Distance (KAD), a novel, distribution-free, unbiased, and computationally efficient metric based on Maximum Mean Discrepancy (MMD). Through analysis and empirical validation, we demonstrate KAD's advantages: (1) faster convergence with smaller sample sizes, enabling reliable evaluation with limited data; (2) lower computational cost, with scalable GPU acceleration; and (3) stronger alignment with human perceptual judgments. By leveraging advanced embeddings and characteristic kernels, KAD captures nuanced differences between real and generated audio. Open-sourced in the kadtk toolkit, KAD provides an efficient, reliable, and perceptually aligned benchmark for evaluating generative audio models.
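To make the MMD idea concrete, the following is a minimal sketch of the standard unbiased estimator of squared MMD between two sets of embedding vectors. It is not the kadtk implementation: the function names (`rbf_kernel`, `mmd2_unbiased`), the Gaussian kernel, and the fixed `bandwidth` value are assumptions chosen for illustration, and the abstract does not specify the kernel, bandwidth rule, or any scaling that KAD applies.

```python
import math
import random


def rbf_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between sample sets xs and ys.

    xs, ys: lists of equal-length float vectors (e.g. audio embeddings).
    The within-set sums skip the i == j terms, which is what removes the bias.
    The estimate can be slightly negative when the two sets are similar.
    """
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("need at least two samples in each set")
    k_xx = sum(rbf_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(rbf_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    k_xy = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def gaussian_samples(rng, count, dim, mean):
    """Hypothetical stand-in for embeddings: i.i.d. Gaussian vectors."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(0)
    reference = gaussian_samples(rng, 100, 4, 0.0)
    similar = gaussian_samples(rng, 100, 4, 0.0)   # same distribution
    shifted = gaussian_samples(rng, 100, 4, 1.0)   # mean moved by 1.0
    print("same distribution:   ", mmd2_unbiased(reference, similar))
    print("shifted distribution:", mmd2_unbiased(reference, shifted))
```

No covariance matrix or matrix square root appears anywhere in the estimator, which is one way to read the abstract's "distribution-free" claim: nothing here assumes the embeddings are Gaussian. The double loops are written for clarity; a practical implementation would compute the kernel matrices in a vectorized way.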

Country:
- Oceania > Australia (0.04)
- North America > United States (0.04)
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe
  - United Kingdom > England
    - Surrey > Guildford (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Italy > Calabria
    - Catanzaro Province > Catanzaro (0.04)
  - Austria > Upper Austria
    - Linz (0.04)
- Asia
  - Bangladesh (0.04)
  - South Korea
    - Seoul > Seoul (0.05)
    - Incheon > Incheon (0.04)
    - Gwangju > Gwangju (0.04)
  - Singapore > Central Region
    - Singapore (0.04)
  - Japan > Honshū
    - Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
  - China
    - Beijing > Beijing (0.05)
    - Shanghai > Shanghai (0.04)
    - Shaanxi Province > Xi'an (0.04)
    - Heilongjiang Province > Harbin (0.04)

Genre:
- Research Report
  - Experimental Study (0.93)
  - New Finding (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks (0.68)
