Set a Thief to Catch a Thief: Combating Label Noise through Noisy Meta Learning

Wang, Hanxuan, Lu, Na, Zhao, Xueying, Yan, Yuxuan, Ma, Kaipeng, Keong, Kwoh Chee, Carneiro, Gustavo

arXiv.org Artificial Intelligence 

Abstract -- Learning from noisy labels (LNL) aims to train high-performance deep models using noisy datasets. Meta-learning-based label correction methods have demonstrated remarkable performance in LNL by designing various meta label rectification tasks. However, an extra clean validation set is a prerequisite for these methods to perform label correction, which requires additional labor and greatly limits their practicality. To tackle this issue, we propose STCT, a novel noisy meta label correction framework that counterintuitively uses noisy data to correct label noise, borrowing the spirit of the saying "Set a Thief to Catch a Thief". The core idea of STCT is to leverage noisy data that is i.i.d. with the training data as a validation set to evaluate model performance and perform label correction within a meta learning framework, eliminating the need for extra clean data. By decoupling the complex bi-level optimization of meta learning into representation learning and label correction, STCT is solved through an alternating training strategy between noisy meta correction and semi-supervised representation learning. Extensive experiments on synthetic and real-world datasets demonstrate the outstanding performance of STCT, particularly in high noise rate scenarios. STCT achieves 96.9% label correction accuracy and 95.2% classification accuracy on CIFAR-10 with 80% symmetric noise, significantly surpassing the current state-of-the-art.

INTRODUCTION

Deep learning has achieved great success in various fields, attributed to the availability of carefully annotated large-scale datasets [1], [2], [3].
However, the collection of high-quality datasets generally comes with high annotation cost and intensive human intervention, creating a significant obstacle to the development of deep learning. Fortunately, the annotation cost issue can be mitigated through web crawling [4] and crowdsourcing [5]. However, such low-cost datasets often contain a considerable number of noisy labels, which may lead to severe overfitting of neural networks and performance degradation [6].
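To make the abstract's core idea concrete, the following toy sketch illustrates bi-level meta label correction with a noisy validation set: an inner loop fits a model to the current soft labels, and an outer loop nudges those soft labels to reduce loss on a held-out set that is itself noisy but i.i.d. with the training data. This is only an illustration of the general mechanism, not the STCT implementation; in particular, finite differences stand in for the meta gradient, and the 1-D logistic task, noise rate, and step sizes are all assumptions made for the sketch.

```python
import numpy as np

# Toy sketch of meta label correction with a *noisy* validation set
# (illustration of the idea only, not the STCT algorithm).
rng = np.random.default_rng(0)

def make_noisy_split(n, noise_rate):
    """1-D binary task (true rule: y = 1 iff x > 0) with symmetric label noise."""
    x = rng.normal(size=n)
    y_true = (x > 0).astype(float)
    y_noisy = np.where(rng.random(n) < noise_rate, 1 - y_true, y_true)
    return x, y_true, y_noisy

X_tr, y_tr_true, y_tr_noisy = make_noisy_split(40, 0.4)
X_val, _, y_val_noisy = make_noisy_split(40, 0.4)  # i.i.d. noisy validation set

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def inner_fit(soft_labels, steps=40, lr=1.0):
    """Inner level: fit logistic weights (w, b) to the current soft labels."""
    w = b = 0.0
    for _ in range(steps):
        p = sigmoid(w * X_tr + b)
        w -= lr * np.mean((p - soft_labels) * X_tr)
        b -= lr * np.mean(p - soft_labels)
    return w, b

def val_loss(w, b):
    """Outer-level objective: cross-entropy on the noisy validation set."""
    p = np.clip(sigmoid(w * X_val + b), 1e-7, 1 - 1e-7)
    return -np.mean(y_val_noisy * np.log(p) + (1 - y_val_noisy) * np.log(1 - p))

# Outer level: nudge each soft label to reduce validation loss
# (finite differences stand in for the meta gradient).
soft = y_tr_noisy.astype(float)
for _ in range(3):
    base = val_loss(*inner_fit(soft))
    grad = np.zeros_like(soft)
    for i in range(soft.size):
        d = 0.1 if soft[i] <= 0.5 else -0.1  # step away from the nearest bound
        bumped = soft.copy()
        bumped[i] += d
        grad[i] = (val_loss(*inner_fit(bumped)) - base) / d
    soft = np.clip(soft - 5.0 * grad, 0.0, 1.0)

corrected = (soft > 0.5).astype(float)  # corrected hard labels
```

The reason a noisy validation set can still guide correction, as the abstract notes, is that it is i.i.d. with the training distribution: under symmetric noise, a model that generalizes better on the noisy validation set also tracks the clean decision rule better in expectation.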