Banff
075b051ec3d22dac7b33f788da631fd4-Paper.pdf
We investigate whether post-hoc model explanations are effective for diagnosing model errors-model debugging. In response to the challenge of explaining a model's prediction, a vast array of explanation methods have been proposed. Despite increasing use, it is unclear if they are effective. To start, we categorizebugs,based on their source, into: data, model, and test-timecontamination bugs.
Principles of Lipschitz continuity in neural networks
Deep learning has achieved remarkable success across a wide range of domains, significantly expanding the frontiers of what is achievable in artificial intelligence. Yet, despite these advances, critical challenges remain -- most notably, ensuring robustness to small input perturbations and generalization to out-of-distribution data. These critical challenges underscore the need to understand the underlying fundamental principles that govern robustness and generalization. Among the theoretical tools available, Lipschitz continuity plays a pivotal role in governing the fundamental properties of neural networks related to robustness and generalization. It quantifies the worst-case sensitivity of network's outputs to small input perturbations. While its importance is widely acknowledged, prior research has predominantly focused on empirical regularization approaches based on Lipschitz constraints, leaving the underlying principles less explored. This thesis seeks to advance a principled understanding of the principles of Lipschitz continuity in neural networks within the paradigm of machine learning, examined from two complementary perspectives: an internal perspective -- focusing on the temporal evolution of Lipschitz continuity in neural networks during training (i.e., training dynamics); and an external perspective -- investigating how Lipschitz continuity modulates the behavior of neural networks with respect to features in the input data, particularly its role in governing frequency signal propagation (i.e., modulation of frequency signal propagation).
FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection
Cui, Jin, Zhao, Boran, Xu, Jiajun, Guo, Jiaqi, Guan, Shuo, Ren, Pengju
Existing methods are either: (i) DNN-based, which are inherently coupled with network-specific parameters, inevitably introducing architectural bias and compromising generalization; or (ii) DNN-free, which utilize heuristics that lack rigorous theoretical guarantees for stability and accuracy. Neither approach explicitly constrains distributional equivalence of the representative subsets, largely because continuous distribution matching is broadly considered inapplicable to discrete dataset sampling. Furthermore, prevalent distribution metrics (e.g., MSE, KL, MMD, and CE) are often incapable of accurately capturing higher-order moments differences. These deficiencies lead to suboptimal coreset performance, preventing the selected coreset from being truly equivalent to the original dataset. W e propose F AST (Frequency-domain Aligned Sampling via T opology), the first DNN-free distribution-matching coreset selection framework that formulates coreset selection task as a graph-constrained optimization problem grounded in spectral graph theory and employs the Characteristic Function Distance (CFD) to capture full distributional information (i.e., all moments and intrinsic correlations) in the frequency domain. W e further discover that naive CFD suffers from a "vanishing phase gradient" issue in medium and high-frequency regions; to address this, we introduce an Attenuated Phase-Decoupled CFD.