Review for NeurIPS paper: Estimating Training Data Influence by Tracing Gradient Descent

Neural Information Processing Systems

Weaknesses: I have some major concerns with the evaluation part of the paper. A simple baseline would be a loss-based selection method: simply select training points based on their loss change. A recent paper [DataLens IJCNN 20] shows that simple loss-based selection outperforms both influence functions and representer point selection on mislabelled-data identification when the fraction of mislabelled data is small; as that fraction increases, influence functions work better than the loss-based method.
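The baseline the reviewer describes is easy to make concrete. A minimal sketch, assuming per-example losses are already computed and ranking by final loss rather than loss change (the function name and data here are illustrative, not from the DataLens paper):

```python
import numpy as np

def rank_by_loss(losses):
    """Rank training examples by per-example loss, highest first.
    High-loss points are flagged as candidate mislabelled examples."""
    return np.argsort(losses)[::-1]

# Hypothetical per-example losses for four training points
losses = np.array([0.05, 2.3, 0.10, 1.7])
suspects = rank_by_loss(losses)[:2]  # two most suspicious indices
print(suspects.tolist())  # [1, 3]
```

The reviewer's point is that this trivially cheap ranking can match or beat gradient-based influence methods when only a few labels are corrupted.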


Estimating Training Data Influence by Tracing Gradient Descent

Pruthi, Garima, Liu, Frederick, Sundararajan, Mukund, Kale, Satyen

arXiv.org Machine Learning

We introduce a method called TracIn that computes the influence of a training example on a prediction made by the model, by tracing how the loss on the test point changes during the training process whenever the training example of interest was utilized. We provide a scalable implementation of TracIn via a combination of a few key ideas: (a) a first-order approximation to the exact computation, (b) random projections to speed up the first-order approximation for large models, (c) saved checkpoints of standard training procedures, and (d) cherry-picking layers of a deep neural network. An experimental evaluation shows that TracIn is more effective at identifying mislabelled training examples than related methods such as influence functions and representer points. We also discuss insights from applying the method to vision, regression, and natural language tasks.
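The first-order approximation in (a), combined with the saved checkpoints in (c), reduces the influence score to a learning-rate-weighted sum of gradient dot products across checkpoints. A minimal sketch, assuming the per-checkpoint gradients have already been flattened into NumPy vectors (the function name is illustrative, not the authors' implementation):

```python
import numpy as np

def tracin_influence(train_grads, test_grads, lrs):
    """First-order TracIn score: sum over saved checkpoints k of
    lr_k * <grad loss(w_k, train example), grad loss(w_k, test point)>.
    train_grads, test_grads: lists of flattened gradient vectors,
    one per checkpoint; lrs: learning rate at each checkpoint."""
    return sum(lr * float(np.dot(g_tr, g_te))
               for lr, g_tr, g_te in zip(lrs, train_grads, test_grads))

# Toy example: two checkpoints, 3-dimensional "gradients"
train_grads = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0])]
test_grads  = [np.array([2.0, 1.0, 0.0]), np.array([1.0, 3.0, 0.0])]
score = tracin_influence(train_grads, test_grads, lrs=[0.1, 0.1])
print(round(score, 6))  # 0.5 (= 0.1*2 + 0.1*3)
```

A positive score marks a "proponent" (the example reduced the test loss) and a negative score an "opponent"; idea (b) would replace each gradient vector with a lower-dimensional random projection before taking the dot products.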