Prestopping: How Does Early Stopping Help Generalization against Label Noise?

arXiv.org Machine Learning

Thus, it is challenging to train a DNN robustly when noisy labels exist in the training data. A popular approach to dealing with noisy labels is "sample selection," which selects true-labeled samples from the noisy training data (Jiang et al., 2018; Ren et al., 2018; Han et al., 2018; Yu et al., 2019; Song et al., 2019). Loss-based separation, which treats small-loss samples as true-labeled, is well known to be justified by the memorization effect (Arpit et al., 2017): DNNs tend to learn easy patterns first and then gradually memorize all samples. Han et al. (2018) empirically showed that training on such small-loss samples yields a much better [...] Despite its great success, Song et al. (2019) have recently argued that the performance of the loss-based separation becomes considerably worse depending on the type of label noise. The memorization rate for false-labeled samples is faster with pair noise than with symmetric noise. Regardless of the noise type, the memorization of false-labeled samples significantly increases at a late stage of training.
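
As a rough illustration of the loss-based sample selection discussed above, the following minimal sketch keeps the fraction of a batch with the smallest per-sample cross-entropy loss. It is not the procedure of any of the cited papers: the function names, the `keep_ratio` parameter, and the toy logits and labels are all assumptions made for this example.

```python
import numpy as np

def per_sample_cross_entropy(logits, labels):
    """Cross-entropy loss of each sample, computed from raw logits."""
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def select_small_loss(losses, keep_ratio):
    """Sorted indices of the keep_ratio fraction of samples with the smallest loss.

    keep_ratio is a hypothetical knob for this sketch; in practice it would
    have to be chosen, for example from an estimate of the noise rate.
    """
    num_keep = int(round(keep_ratio * len(losses)))
    return np.sort(np.argsort(losses)[:num_keep])

# Toy batch (made-up values): 4 samples, 3 classes.
# The last sample's given label disagrees with what the logits favor.
logits = np.array([[4.0, 0.0, 0.0],
                   [0.0, 3.0, 0.0],
                   [0.0, 0.0, 5.0],
                   [4.0, 0.0, 0.0]])
labels = np.array([0, 1, 2, 1])

losses = per_sample_cross_entropy(logits, labels)
selected = select_small_loss(losses, keep_ratio=0.75)
print(selected)  # indices of the samples treated as true-labeled
```

Because the selection relies only on loss values, it inherits the weakness noted above: once the network has memorized false-labeled samples, their losses also become small and the separation degrades.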