Regularized $f$-Divergence Kernel Tests
Ribero, Mónica, Schrab, Antonin, Gretton, Arthur
We propose a framework to construct practical kernel-based two-sample tests from the family of $f$-divergences. The test statistic is computed from the witness function of a regularized variational representation of the divergence, which we estimate using kernel methods. The proposed test is adaptive over hyperparameters such as the kernel bandwidth and the regularization parameter. We provide theoretical guarantees for statistical test power across our family of $f$-divergence estimates. While our test covers a variety of $f$-divergences, we bring particular focus to the Hockey-Stick divergence, motivated by its applications to differential privacy auditing and machine unlearning evaluation. For two-sample testing, experiments demonstrate that different $f$-divergences are sensitive to different localized differences, illustrating the importance of leveraging diverse statistics. For machine unlearning, we propose a relative test that distinguishes true unlearning failures from safe distributional variations.
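As background for this abstract, here is a minimal sketch of the two standard identities it builds on, in generic notation ($f^*$ denotes the convex conjugate of $f$, and $\gamma \ge 1$ the Hockey-Stick parameter); the paper's regularized objective and its kernel estimator refine this unregularized form.

```latex
% Variational (Fenchel dual) representation of an f-divergence, whose
% maximiser is the witness function from which the test statistic is built:
\[
  D_f(P \,\|\, Q)
  = \sup_{g}\; \mathbb{E}_{X \sim P}\big[g(X)\big]
    - \mathbb{E}_{Y \sim Q}\big[f^{*}\!\big(g(Y)\big)\big].
\]
% Hockey-Stick divergence of order \gamma \ge 1: the f-divergence generated
% by f(t) = (t - \gamma)_+ . A mechanism is (\varepsilon, \delta)-differentially
% private iff HS_{e^\varepsilon} \le \delta between outputs on neighbouring
% inputs, which motivates its use for privacy auditing.
\[
  \mathrm{HS}_{\gamma}(P \,\|\, Q)
  = \mathbb{E}_{Y \sim Q}\Big[\Big(\tfrac{\mathrm{d}P}{\mathrm{d}Q}(Y) - \gamma\Big)_{+}\Big]
  = \sup_{A}\,\big(P(A) - \gamma\, Q(A)\big).
\]
```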
A Unified View of Optimal Kernel Hypothesis Testing
This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, with two adaptive kernel selection methods (kernel pooling and aggregation), and under various testing constraints: computational efficiency, differential privacy, and robustness to data corruption. Intuition behind the derivation of the power results is provided in a unified way across the three frameworks, and open problems are highlighted.
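For readers new to the three frameworks, a compact reminder of the discrepancies involved, in standard notation (kernel $k$ with RKHS $\mathcal{H}$, mean embedding $\mu_P = \mathbb{E}_{X \sim P}[k(X, \cdot)]$, and Langevin Stein operator $T_p$); the precise assumptions are those of the paper.

```latex
% MMD: RKHS distance between mean embeddings (two-sample testing).
\[
  \mathrm{MMD}(P, Q) = \big\| \mu_P - \mu_Q \big\|_{\mathcal{H}} .
\]
% HSIC: MMD between the joint law and the product of marginals
% (independence testing).
\[
  \mathrm{HSIC}(X, Y) = \mathrm{MMD}\big(P_{XY},\, P_X \otimes P_Y\big).
\]
% KSD: Stein discrepancy of the sample law Q to a model density p known
% up to normalisation (goodness-of-fit testing), with Stein operator
% (T_p g)(y) = \langle \nabla \log p(y), g(y) \rangle + \nabla \cdot g(y):
\[
  \mathrm{KSD}(Q \,\|\, p)
  = \sup_{\|g\|_{\mathcal{H}^d} \le 1} \mathbb{E}_{Y \sim Q}\big[(T_p g)(Y)\big].
\]
```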
A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD
This article provides a practical introduction to kernel discrepancies, focusing on the Maximum Mean Discrepancy (MMD), the Hilbert-Schmidt Independence Criterion (HSIC), and the Kernel Stein Discrepancy (KSD). Various estimators for these discrepancies are presented, including the commonly-used V-statistics and U-statistics, as well as several forms of the more computationally efficient incomplete U-statistics. The importance of the choice of kernel bandwidth is stressed, showing how it affects the behaviour of the discrepancy estimators. Adaptive estimators are introduced, which combine multiple estimators with various kernels, addressing the problem of kernel selection. This paper corresponds to the introduction of my PhD thesis (Schrab, 2025a, Chapter 2) and is presented as a standalone article to introduce the reader to kernel discrepancy estimators. First, in Section 1, we define kernels, Reproducing Kernel Hilbert Spaces, mean embeddings and cross-covariance operators, and present kernel properties such as characteristicity, universality and translation invariance. Then, in Section 2, we introduce the Maximum Mean Discrepancy, the Hilbert-Schmidt Independence Criterion, and the Kernel Stein Discrepancy, as well as their estimators, and we discuss the importance of the choice of kernel for such measures. We then introduce a collection of statistics in Section 3, including the commonly-used complete statistics, as well as their incomplete counterparts, which trade accuracy for computational efficiency. Finally, in Section 4, we construct adaptive estimators combining multiple statistics with various kernels, which is one method to address the problem of kernel selection.
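To make the estimators of Section 2 concrete, here is a minimal sketch (not the thesis code; the function names and bandwidth below are illustrative) of the unbiased quadratic-time MMD$^2$ U-statistic with a Gaussian kernel, in NumPy:

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dists = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq_dists / (2 * bandwidth**2))

def mmd2_u_statistic(X, Y, bandwidth):
    """Unbiased MMD^2 U-statistic between samples X (n, d) and Y (m, d)."""
    n, m = X.shape[0], Y.shape[0]
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    # Drop the diagonal terms so the within-sample averages are unbiased.
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    term_xy = 2 * Kxy.mean()
    return term_xx + term_yy - term_xy

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))
Y = rng.normal(0.5, 1.0, size=(200, 2))
print(mmd2_u_statistic(X, Y, bandwidth=1.0))  # typically positive under a mean shift
```

Removing the diagonal terms is what distinguishes the U-statistic from the V-statistic, which keeps them and is biased upwards.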
MMD Aggregated Two-Sample Test
Schrab, Antonin, Kim, Ilmun, Albert, Mélisande, Laurent, Béatrice, Guedj, Benjamin, Gretton, Arthur
We propose a novel nonparametric two-sample test based on the Maximum Mean Discrepancy (MMD), which is constructed by aggregating tests with different kernel bandwidths. This aggregation procedure, called MMDAgg, ensures that test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power), or arbitrary kernel choices such as the median heuristic. We work in the non-asymptotic framework, and prove that our aggregated test is minimax adaptive over Sobolev balls. Our guarantees are not restricted to a specific kernel, but hold for any product of one-dimensional translation invariant characteristic kernels which are absolutely and square integrable. Moreover, our results apply for popular numerical procedures to determine the test threshold, namely permutations and the wild bootstrap. Through numerical experiments on both synthetic and real-world datasets, we demonstrate that MMDAgg outperforms alternative state-of-the-art approaches to MMD kernel adaptation for two-sample testing.
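Below is a simplified sketch of the aggregation idea, not the MMDAgg algorithm itself: MMDAgg calibrates the per-bandwidth levels jointly (adjusting weighted levels so the aggregated test has exact level $\alpha$), whereas the Bonferroni split $\alpha / |\Lambda|$ used here is a cruder stand-in, and all names are illustrative.

```python
import numpy as np

def mmd2_u(X, Y, bw):
    """Unbiased quadratic-time MMD^2 U-statistic with a Gaussian kernel."""
    def K(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2 * bw**2))
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = K(X, X), K(Y, Y), K(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2 * Kxy.mean())

def mmd_agg_test(X, Y, bandwidths, alpha=0.05, n_perms=500, seed=0):
    """Reject if the MMD^2 statistic exceeds its permutation quantile at the
    Bonferroni-corrected level alpha / len(bandwidths) for any bandwidth."""
    rng = np.random.default_rng(seed)
    n, Z = len(X), np.vstack([X, Y])
    level = alpha / len(bandwidths)
    for bw in bandwidths:
        observed = mmd2_u(X, Y, bw)
        # Permutation null: recompute the statistic on shuffled pooled samples.
        perms = np.array([
            mmd2_u(Z[p[:n]], Z[p[n:]], bw)
            for p in (rng.permutation(len(Z)) for _ in range(n_perms))
        ])
        if observed > np.quantile(perms, 1 - level):
            return True  # some bandwidth detects a difference
    return False

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(150, 2))
Y = rng.normal(0.4, 1.0, size=(150, 2))
print(mmd_agg_test(X, Y, bandwidths=[0.25, 0.5, 1.0, 2.0, 4.0]))
```

The point of aggregating is visible even in this crude version: no bandwidth needs to be selected in advance, and power is retained whenever at least one bandwidth in the collection is well matched to the scale of the difference between the samples.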