Improving Resistance to Noisy Label Fitting by Reweighting Gradient in SAM
Hoang-Chau Luong, Thuc Nguyen-Quang, Minh-Triet Tran
(These authors contributed equally to this work.)

Noisy labels pose a substantial challenge in machine learning, often leading to overfitting and poor generalization. Sharpness-Aware Minimization (SAM), as demonstrated by Foret et al. (2021), improves generalization over traditional Stochastic Gradient Descent (SGD) in classification tasks with noisy labels by implicitly slowing the fitting of noisy examples. While SAM's ability to generalize in noisy environments has been studied in several simplified settings, its full potential in more realistic training settings remains underexplored. In this work, we analyze SAM's behavior at each iteration and identify specific components of the gradient vector that contribute significantly to its robustness against noisy labels. Based on these insights, we propose SANER (Sharpness-Aware Noise-Explicit Reweighting), an effective variant that enhances SAM's ability to control the rate at which noisy labels are fitted. Our experiments on CIFAR-10, CIFAR-100, and Mini-WebVision demonstrate that SANER consistently outperforms SAM, achieving up to an 8% improvement on CIFAR-100 with 50% label noise.

Noisy labels caused by human annotation errors are common in many large-scale datasets, such as CIFAR-10N and CIFAR-100N (Wei et al., 2022), Clothing1M (Xiao et al., 2015), and WebVision (Li et al., 2017). Over-parameterized deep neural networks, which have enough capacity to memorize entire large datasets, can easily overfit such noisy labels, leading to poor generalization performance (Zhang et al., 2021). Moreover, the lottery ticket hypothesis (Frankle & Carbin, 2019) indicates that only a subset of a network's parameters is crucial for generalization. This highlights the importance of noise-robust learning, whose goal is to train a robust classifier despite inaccurate or noisy labels in the training dataset.

Sharpness-Aware Minimization (SAM), introduced by Foret et al. (2021), is an optimizer designed to improve generalization by searching for flat minima. It has shown superior performance over SGD in various tasks, especially classification tasks involving noisy labels (Baek et al., 2024). Understanding the mechanisms behind SAM's success is crucial for further improvements in handling label noise.
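For context, SAM's per-iteration update first perturbs the weights toward a nearby worst-case point within a small neighborhood, then descends using the gradient evaluated there; SANER operates on the components of that second gradient. Below is a minimal PyTorch-style sketch of the standard two-step SAM update (Foret et al., 2021). The function name sam_step, the value rho=0.05, and the other specifics are illustrative assumptions rather than the authors' code, and SANER's component reweighting rule is only marked by a comment, since its exact criterion is not given in this excerpt.

    import torch

    def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
        # One SAM update (Foret et al., 2021): ascend to a nearby worst-case
        # point within an L2 ball of radius rho, then descend with the
        # gradient taken at that perturbed point. Illustrative sketch only.

        # 1) Gradient at the current weights w.
        base_opt.zero_grad()
        loss_fn(model(x), y).backward()
        params = [p for p in model.parameters() if p.grad is not None]
        grad_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params)) + 1e-12

        # 2) Inner ascent step: w <- w + rho * g / ||g||.
        with torch.no_grad():
            eps = [rho * p.grad / grad_norm for p in params]
            for p, e in zip(params, eps):
                p.add_(e)

        # 3) Gradient at the perturbed weights (the "SAM gradient").
        base_opt.zero_grad()
        loss_fn(model(x), y).backward()
        # SANER would reweight individual components of this gradient here;
        # its precise reweighting rule is not reproduced in this sketch.

        # 4) Undo the perturbation, then update w with the SAM gradient.
        with torch.no_grad():
            for p, e in zip(params, eps):
                p.sub_(e)
        base_opt.step()

As a usage sketch, wrapping a plain SGD optimizer looks like: opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9), then calling sam_step(model, torch.nn.functional.cross_entropy, x, y, opt) once per mini-batch. Note that each SAM iteration costs two forward-backward passes, which is the price paid for evaluating the gradient at the worst-case perturbed point.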
arXiv.org Artificial Intelligence
Nov-26-2024