KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation

arXiv.org Artificial Intelligence

Although widely adopted for evaluating generated audio signals, the Fréchet Audio Distance (FAD) suffers from significant limitations, including reliance on Gaussian assumptions, sensitivity to sample size, and high computational complexity. As an alternative, we introduce the Kernel Audio Distance (KAD), a novel, distribution-free, unbiased, and computationally efficient metric based on Maximum Mean Discrepancy (MMD). Through analysis and empirical validation, we demonstrate KAD's advantages: (1) faster convergence with smaller sample sizes, enabling reliable evaluation with limited data; (2) lower computational cost, with scalable GPU acceleration; and (3) stronger alignment with human perceptual judgments. By leveraging advanced embeddings and characteristic kernels, KAD captures nuanced differences between real and generated audio. Open-sourced in the kadtk toolkit, KAD provides an efficient, reliable, and perceptually aligned benchmark for evaluating generative audio models.
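To make the MMD idea concrete, here is a minimal sketch of the standard unbiased estimator of squared MMD between two sets of embedding vectors. It is not the kadtk API and not necessarily the authors' exact formulation: the Gaussian kernel, the `bandwidth` parameter, the function names, and the plain-list representation of embeddings are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between samples xs and ys.

    xs, ys: lists of embedding vectors (lists of floats); hypothetical inputs.
    The within-set sums skip the i == j terms, which is what removes the bias.
    The estimate can be slightly negative when the two samples are similar.
    """
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("need at least two samples in each set")
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def sample_gaussian(rng, count, dim, mean):
    """Draw `count` synthetic embeddings from an isotropic unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(count)]


def demo(seed=0):
    """Compare a reference set against a matching set and a shifted set."""
    rng = random.Random(seed)
    reference = sample_gaussian(rng, 100, 4, 0.0)
    similar = sample_gaussian(rng, 100, 4, 0.0)
    shifted = sample_gaussian(rng, 100, 4, 3.0)
    return (mmd2_unbiased(reference, similar, bandwidth=2.0),
            mmd2_unbiased(reference, shifted, bandwidth=2.0))
```

Calling `demo()` returns two scores: the first compares two samples from the same distribution and should sit near zero, while the second compares against a shifted distribution and should be clearly larger. No Gaussian assumption on the embeddings is needed, only pairwise kernel evaluations, which is why the computation parallelizes well.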


Rawlsian Fairness in Online Bipartite Matching: Two-sided, Group, and Individual

arXiv.org Artificial Intelligence

Online bipartite-matching platforms are ubiquitous and find applications in important areas such as crowdsourcing and ridesharing. In the most general form, the platform consists of three entities: two sides to be matched and a platform operator that decides the matching. The design of algorithms for such platforms has traditionally focused on the operator's (expected) profit. Recent reports have shown that certain demographic groups may receive less favorable treatment under pure profit maximization. As a result, a collection of online matching algorithms has been developed that gives a fair-treatment guarantee for one side of the market at the expense of a drop in the operator's profit. In this paper, we generalize the existing work to offer fair-treatment guarantees to both sides of the market simultaneously, at a calculated worst-case drop in operator profit. We consider group and individual Rawlsian fairness criteria. Moreover, our algorithms have theoretical guarantees and adjustable parameters that can be tuned as desired to balance the trade-off between the utilities of the three sides. We also derive hardness results that give clear upper bounds on the performance of any algorithm.
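As a rough illustration of what a Rawlsian (max-min) criterion measures, the sketch below scores a fixed set of per-participant utilities by the worst-off individual and by the worst-off group. It does not reproduce the paper's algorithms or its exact definitions: the function names, the dictionary inputs, and the use of the group mean as the group utility are assumptions chosen for illustration.

```python
def individual_rawlsian(utilities):
    """Utility of the worst-off individual.

    utilities: dict mapping a participant id to that participant's utility.
    """
    return min(utilities.values())


def group_rawlsian(utilities, groups):
    """Utility of the worst-off group.

    groups: dict mapping a group name to a non-empty list of participant ids.
    The group utility is taken to be the mean over members (an assumption).
    """
    return min(
        sum(utilities[member] for member in members) / len(members)
        for members in groups.values()
    )


def fairer_outcome(outcome_a, outcome_b, groups):
    """Return whichever outcome has the higher worst-group utility (ties go to outcome_a)."""
    if group_rawlsian(outcome_b, groups) > group_rawlsian(outcome_a, groups):
        return outcome_b
    return outcome_a
```

Under this view, an outcome that raises the total utility while lowering the minimum is considered less fair, which is the tension with operator profit that the abstract describes.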


Lifelong Knowledge Learning in Rule-based Dialogue Systems

arXiv.org Artificial Intelligence

One of the main weaknesses of current chatbots or dialogue systems is that they do not learn online during conversations after they are deployed. This is a major lost opportunity. Each human user has a great deal of knowledge about the world that may be useful to others. If a chatbot can learn from its users while chatting, it will greatly expand its knowledge base and serve its users better. This paper proposes to build such a learning capability into a rule-based chatbot so that it can continuously acquire new knowledge while chatting with users. This work is useful because many real-life deployed chatbots are rule-based.