Subclass-Dominant Label Noise: A Counterexample for the Success of Early Stopping

Jan-19-2025, 23:39:37 GMT – Neural Information Processing Systems

In this paper, we empirically investigate a previously overlooked and widespread type of label noise, subclass-dominant label noise (SDN). Our findings reveal that, during the early stages of training, deep neural networks can rapidly memorize mislabeled examples in SDN. This phenomenon poses challenges in effectively selecting confident examples using conventional early stopping techniques. To address this issue, we delve into the properties of SDN and observe that long-trained representations are superior at capturing the high-level semantics of mislabeled examples, leading to a clustering effect where similar examples are grouped together. Based on this observation, we propose a novel method called NoiseCluster that leverages the geometric structures of long-trained representations to identify and correct SDN.
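The clustering observation in the abstract can be illustrated with a small toy sketch. This is not the authors' NoiseCluster algorithm, whose details are not given here. It only assumes that representations from a long-trained network are already available as a feature matrix, and it uses a hypothetical heuristic: split each labeled class into two clusters and flag a cluster that sits closer to another class than to the rest of its own. The function names `two_means` and `flag_suspects` and the synthetic 2-D data are assumptions made for illustration.

```python
import numpy as np


def two_means(x, n_iter=20):
    """Plain 2-means with a deterministic farthest-point initialisation."""
    c0 = x[np.argmax(np.linalg.norm(x - x.mean(axis=0), axis=1))]
    c1 = x[np.argmax(np.linalg.norm(x - c0, axis=1))]
    centers = np.stack([c0, c1]).astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centers[k] = x[assign == k].mean(axis=0)
    # Final assignment that matches the returned centers.
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


def flag_suspects(features, noisy_labels):
    """Flag examples whose sub-cluster lies closer to another class (toy heuristic)."""
    features = np.asarray(features, dtype=float)
    noisy_labels = np.asarray(noisy_labels)
    classes = np.unique(noisy_labels)
    class_means = {c: features[noisy_labels == c].mean(axis=0) for c in classes}
    suspect = np.zeros(len(noisy_labels), dtype=bool)
    for c in classes:
        idx = np.flatnonzero(noisy_labels == c)
        others = [o for o in classes if o != c]
        if len(idx) < 2 or not others:
            continue
        assign, centers = two_means(features[idx])
        for k in range(2):
            d_own = np.linalg.norm(centers[k] - centers[1 - k])
            d_foreign = min(np.linalg.norm(centers[k] - class_means[o]) for o in others)
            if d_foreign < d_own:
                suspect[idx[assign == k]] = True
    return suspect


# Synthetic stand-in for long-trained representations (assumption, not real data):
# a whole subclass of class 1 carries the label 0.
rng = np.random.default_rng(0)
clean0 = rng.normal([0.0, 0.0], 0.3, size=(40, 2))
clean1 = rng.normal([10.0, 0.0], 0.3, size=(40, 2))
sub1_as0 = rng.normal([9.0, 1.0], 0.3, size=(15, 2))

features = np.vstack([clean0, clean1, sub1_as0])
noisy_labels = np.array([0] * 40 + [1] * 40 + [0] * 15)
is_mislabeled = np.array([False] * 80 + [True] * 15)

suspect = flag_suspects(features, noisy_labels)
print("flagged:", int(suspect.sum()), "of", len(suspect))
```

In this toy setting the mislabeled subclass forms its own tight cluster, which is what makes a geometric criterion usable; on real representations the number of clusters and the flagging rule would need to be chosen far more carefully.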

counterexample, early stopping, subclass-dominant label noise

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)