Taming Hyperparameter Sensitivity in Data Attribution: Practical Selection Without Costly Retraining
Neural Information Processing Systems
Data attribution methods, which quantify the influence of individual training data points on a machine learning model, are increasingly popular in data-centric applications of modern AI. Despite a recent surge of new methods in this space, the impact of hyperparameter tuning on these methods remains under-explored. In this work, we present the first large-scale empirical study of the hyperparameter sensitivity of common data attribution methods. Our results show that most methods are indeed sensitive to certain key hyperparameters. However, unlike typical machine learning algorithms, whose hyperparameters can be tuned using computationally cheap validation metrics, data attribution methods are often evaluated by retraining models on subsets of the training data, which makes such evaluation metrics prohibitively costly for hyperparameter tuning.
Jun-14-2026, 07:50:47 GMT