Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions
Chhabra, Anshuman, Li, Bo, Chen, Jian, Mohapatra, Prasant, Liu, Hongfu
arXiv.org Artificial Intelligence
Data-centric learning focuses on enhancing algorithmic performance from the perspective of the training data [Oala et al., 2023]. In contrast to model-centric learning, which designs novel algorithms or optimization techniques to improve performance on fixed training data, data-centric learning keeps the learning algorithm fixed and modifies the training data through trimming, augmenting, or other methods aimed at improving utility [Zha et al., 2023]. Data-centric learning holds significant potential in many areas, such as model interpretation, training subset selection, data generation, noisy label detection, and active learning [Chhabra et al., 2024, Kwon et al., 2024]. The essence of data-centric learning lies in estimating data influence, also known as data valuation [Hammoudeh and Lowd, 2022], in the context of a learning task. Intuitively, the impact of an individual data sample can be measured by the change in learning utility when training with and without that sample. This leave-one-out influence [Cook and Weisberg, 1982] provides a rough gauge of the sample's influence relative to the otherwise complete, fixed training set. The Shapley value [Ghorbani and Zou, 2019, Jia et al., 2019], which originates from cooperative game theory, instead quantifies the increase in value when a group of samples collaborates to achieve the learning goal. Unlike leave-one-out influence, the Shapley value is the weighted average utility change that results from adding the point to different training subsets. Although these retraining-based methods make no assumptions about the learning model, they incur significant computational costs, especially for large-scale data analysis and deep models [Hammoudeh and Lowd, 2022].
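The two retraining-based notions of influence described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the method proposed in the paper: the learner (ridge regression), the utility (negative validation mean squared error), the toy data, and all function names (`fit_ridge`, `utility`, `leave_one_out`, `exact_shapley`) are hypothetical choices made here. The Shapley computation enumerates every subset, so it is feasible only for a handful of points, which is exactly the cost problem the paragraph points out.

```python
import itertools
import math

import numpy as np


def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression (assumed learner, chosen for simplicity)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def utility(idx, X_tr, y_tr, X_val, y_val):
    """Assumed utility: negative validation MSE after training on rows `idx`."""
    idx = list(idx)
    # With no training data, fall back to the all-zero model.
    w = fit_ridge(X_tr[idx], y_tr[idx]) if idx else np.zeros(X_tr.shape[1])
    return -float(np.mean((X_val @ w - y_val) ** 2))


def leave_one_out(X_tr, y_tr, X_val, y_val):
    """Influence of point i = U(full set) - U(full set without i)."""
    n = len(y_tr)
    full = utility(range(n), X_tr, y_tr, X_val, y_val)
    return np.array([
        full - utility([j for j in range(n) if j != i], X_tr, y_tr, X_val, y_val)
        for i in range(n)
    ])


def exact_shapley(X_tr, y_tr, X_val, y_val):
    """Weighted average marginal utility of each point over all subsets."""
    n = len(y_tr)
    values = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset S that excludes i
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for S in itertools.combinations(others, k):
                gain = (utility(S + (i,), X_tr, y_tr, X_val, y_val)
                        - utility(S, X_tr, y_tr, X_val, y_val))
                values[i] += weight * gain
    return values


# Toy data (assumed): y = 2x + 1 with a bias column; the last label is corrupted.
xs = np.arange(6, dtype=float)
X_tr = np.column_stack([np.ones_like(xs), xs])
y_tr = 2.0 * xs + 1.0
y_tr[5] += 10.0  # inject one outlier label

xv = np.array([0.5, 1.5, 2.5, 3.5, 4.5])
X_val = np.column_stack([np.ones_like(xv), xv])
y_val = 2.0 * xv + 1.0

loo = leave_one_out(X_tr, y_tr, X_val, y_val)
shap = exact_shapley(X_tr, y_tr, X_val, y_val)
print("leave-one-out:", np.round(loo, 3))
print("Shapley:      ", np.round(shap, 3))
```

Under this sign convention, a negative leave-one-out score means that removing the point raises the validation utility, so the corrupted sample stands out as detrimental. Leave-one-out needs one retraining per sample, while exact Shapley needs a number of retrainings that grows exponentially with the training set size, which is why practical estimators resort to sampling or to retraining-free approximations.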
May-12-2024
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > France (0.04)
- North America
- Oceania > New Zealand
- North Island > Waikato (0.04)
- Genre:
- Research Report > New Finding (0.93)
- Industry:
- Leisure & Entertainment > Games (0.34)
- Technology: