Yu, Xingrui
Self-Adaptive Gamma Context-Aware SSM-based Model for Metal Defect Detection
Sun, Sijin, Deng, Ming, Yu, Xingrui, Xi, Xinyu, Zhao, Liangbin
The quality of metal surfaces is critical in various industrial applications, including aerospace, manufacturing, and container transportation. Surface defects, such as cracks, dents, and scratches, not only compromise the structural integrity and aesthetics of metal products but also lead to significant economic losses if left undetected. As a result, the accurate and efficient detection of metal surface defects has become an essential task in industrial quality control. In recent years, the adoption of deep learning techniques has significantly advanced the performance of defect detection systems [1]. Convolutional neural networks (CNNs) and transformer-based models have demonstrated exceptional capabilities in handling complex image-based tasks, enabling automated and reliable defect detection. However, several challenges remain: 1) Metal defects often exhibit varied and localized features, making effective multi-scale feature aggregation vital for improving detection accuracy.
Imitation from Diverse Behaviors: Wasserstein Quality Diversity Imitation Learning with Single-Step Archive Exploration
Yu, Xingrui, Wan, Zhenglin, Bossens, David Mark, Lyu, Yueming, Guo, Qing, Tsang, Ivor W.
Learning diverse and high-performance behaviors from a limited set of demonstrations is a grand challenge. Traditional imitation learning methods usually fail in this task because most of them are designed to learn one specific behavior even with multiple demonstrations. Therefore, novel techniques for quality diversity imitation learning are needed to solve the above challenge. This work introduces Wasserstein Quality Diversity Imitation Learning (WQDIL), which 1) improves the stability of imitation learning in the quality diversity setting with latent adversarial training based on a Wasserstein Auto-Encoder (WAE), and 2) mitigates a behavior-overfitting issue using a measure-conditioned reward function with a single-step archive exploration bonus. Empirically, our method significantly outperforms state-of-the-art IL methods, achieving near-expert or beyond-expert QD performance on the challenging continuous control tasks derived from MuJoCo environments.
Quality Diversity Imitation Learning
Wan, Zhenglin, Yu, Xingrui, Bossens, David Mark, Lyu, Yueming, Guo, Qing, Fan, Flint Xiaofeng, Tsang, Ivor
Imitation learning (IL) has shown great potential in various applications, such as robot control. However, traditional IL methods are usually designed to learn only one specific type of behavior, since demonstrations typically correspond to a single expert. In this work, we introduce the first generic framework for Quality Diversity Imitation Learning (QD-IL), which enables the agent to learn a broad range of skills from limited demonstrations. Our framework integrates the principles of quality diversity with adversarial imitation learning (AIL) methods, and can potentially improve any inverse reinforcement learning (IRL) method. Empirically, our framework significantly improves the QD performance of GAIL and VAIL on the challenging continuous control tasks derived from MuJoCo environments. Moreover, our method even achieves 2x expert performance in the most challenging Humanoid environment.
Intrinsic Reward Driven Imitation Learning via Generative Model
Yu, Xingrui, Lyu, Yueming, Tsang, Ivor W.
Imitation learning in a high-dimensional environment is challenging. Most inverse reinforcement learning (IRL) methods fail to outperform the demonstrator in such a high-dimensional environment, e.g., Atari domain. To address this challenge, we propose a novel reward learning module to generate intrinsic reward signals via a generative model. Our generative method can perform better forward state transition and backward action encoding, which improves the module's dynamics modeling ability in the environment. Thus, our module provides the imitation agent both the intrinsic intention of the demonstrator and a better exploration ability, which is critical for the agent to outperform the demonstrator. Empirical results show that our method outperforms state-of-the-art IRL methods on multiple Atari games, even with one-life demonstration. Remarkably, our method achieves performance that is up to 5 times the performance of the demonstration.
Co-teaching: Robust training of deep neural networks with extremely noisy labels
Han, Bo, Yao, Quanming, Yu, Xingrui, Niu, Gang, Xu, Miao, Hu, Weihua, Tsang, Ivor, Sugiyama, Masashi
Deep learning with noisy labels is practically challenging, because the capacity of deep models is so high that they can eventually memorize these noisy labels completely during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they first memorize training data with clean labels and then those with noisy labels. Therefore, in this paper, we propose a new deep learning paradigm called "Co-teaching" for combating noisy labels. Namely, we train two deep neural networks simultaneously and let them teach each other on every mini-batch: first, each network feeds forward all data and selects some data with possibly clean labels; second, the two networks communicate to each other which data in this mini-batch should be used for training; finally, each network backpropagates the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is far superior to state-of-the-art methods in the robustness of trained deep models.
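The three steps in this abstract (select possibly clean data, exchange the selection, update on the peer's selection) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the fixed `keep_ratio`, and the per-sample loss values are all hypothetical, and the actual gradient update is left out.

```python
def small_loss_indices(losses, keep_ratio):
    """Return indices of the lowest-loss samples, treated as possibly clean."""
    # keep_ratio is a hypothetical knob: the fraction of the batch to keep.
    num_keep = max(1, round(keep_ratio * len(losses)))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:num_keep])


def co_teaching_exchange(losses_a, losses_b, keep_ratio):
    """Each network selects small-loss samples; its peer trains on them."""
    picked_by_a = small_loss_indices(losses_a, keep_ratio)
    picked_by_b = small_loss_indices(losses_b, keep_ratio)
    # Cross update: A learns from B's selection and B from A's selection.
    return {"train_a_on": picked_by_b, "train_b_on": picked_by_a}


# Hypothetical per-sample losses of the two networks on one mini-batch.
losses_a = [0.2, 2.5, 0.1, 1.9, 0.4]
losses_b = [0.3, 0.2, 2.2, 0.1, 1.8]
batch_plan = co_teaching_exchange(losses_a, losses_b, keep_ratio=0.6)
print(batch_plan)
```

In a real training loop, each network would then backpropagate the loss computed only on the indices assigned to it.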
How does Disagreement Help Generalization against Label Corruption?
Yu, Xingrui, Han, Bo, Yao, Jiangchao, Niu, Gang, Tsang, Ivor W., Sugiyama, Masashi
Learning with noisy labels is one of the hottest problems in weakly-supervised learning. Based on the memorization effects of deep neural networks, training on small-loss instances has become very promising for handling noisy labels. This has fostered the state-of-the-art approach "Co-teaching", which cross-trains two deep neural networks using the small-loss trick. However, as the number of epochs increases, the two networks converge to a consensus and Co-teaching reduces to the self-training MentorNet. To tackle this issue, we propose a robust learning paradigm called Co-teaching+, which bridges the "Update by Disagreement" strategy with the original Co-teaching. First, the two networks feed forward and predict all data, but keep only the data on which their predictions disagree. Then, among such disagreement data, each network selects its small-loss data, but backpropagates the small-loss data from its peer network and updates its own parameters. Empirical results on benchmark datasets demonstrate that Co-teaching+ is far superior to many state-of-the-art methods in the robustness of trained models.
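The two stages described here (filter by prediction disagreement, then apply the small-loss trick with a cross update) can be sketched as follows. This is an illustrative sketch only: the function name, the `keep_ratio` parameter, the example predictions and losses, and the choice to return empty selections when the networks never disagree are assumptions, not details taken from the paper.

```python
def co_teaching_plus_exchange(preds_a, preds_b, losses_a, losses_b, keep_ratio):
    """Select training indices for two peer networks on one mini-batch."""
    # Stage 1: keep only samples on which the two networks predict different labels.
    disagree = [i for i, (pa, pb) in enumerate(zip(preds_a, preds_b)) if pa != pb]
    if not disagree:
        # Assumption of this sketch: nothing to train on without disagreement.
        return {"train_a_on": [], "train_b_on": []}
    num_keep = max(1, round(keep_ratio * len(disagree)))
    # Stage 2: among disagreement samples, each network ranks by its own loss.
    picked_by_a = sorted(sorted(disagree, key=lambda i: losses_a[i])[:num_keep])
    picked_by_b = sorted(sorted(disagree, key=lambda i: losses_b[i])[:num_keep])
    # Cross update: each network trains on the small-loss data chosen by its peer.
    return {"train_a_on": picked_by_b, "train_b_on": picked_by_a}


# Hypothetical predictions and per-sample losses for six samples.
preds_a = [0, 1, 2, 1, 0, 2]
preds_b = [0, 2, 2, 0, 1, 1]
losses_a = [0.1, 0.9, 0.2, 0.3, 1.5, 0.6]
losses_b = [0.2, 0.4, 0.1, 1.2, 0.5, 2.0]
plus_plan = co_teaching_plus_exchange(preds_a, preds_b, losses_a, losses_b, 0.5)
print(plus_plan)
```

Samples 0 and 2, where both networks agree, are excluded before any loss ranking takes place, which is what keeps the two networks from collapsing to a consensus.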
Pumpout: A Meta Approach for Robustly Training Deep Neural Networks with Noisy Labels
Han, Bo, Niu, Gang, Yao, Jiangchao, Yu, Xingrui, Xu, Miao, Tsang, Ivor, Sugiyama, Masashi
It is challenging to train deep neural networks robustly on industrial-level data, since the labels of such data are heavily noisy and their label generation processes are normally agnostic. To handle these issues, by using the memorization effects of deep neural networks, we may train deep neural networks on the whole dataset for only the first few iterations. Then, we may employ early stopping or the small-loss trick to train them on selected instances. However, in such training procedures, deep neural networks inevitably memorize some noisy labels, which will degrade their generalization. In this paper, we propose a meta algorithm called Pumpout to overcome the problem of memorizing noisy labels. By using scaled stochastic gradient ascent, Pumpout actively squeezes out the negative effects of noisy labels from the training model, instead of passively forgetting these effects. We leverage Pumpout to upgrade two representative methods: MentorNet and Backward Correction. Empirical results on benchmark datasets demonstrate that Pumpout can significantly improve the robustness of representative methods.
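The idea of scaled stochastic gradient ascent can be illustrated on a one-parameter model. This is a rough sketch under stated assumptions rather than the published algorithm: the squared loss, the `looks_clean` flag (standing in for whatever selection rule the base method uses), the learning rate `lr`, and the scaling factor `gamma` are all hypothetical choices.

```python
def squared_loss_grad(weight, x, y):
    """Gradient of 0.5 * (weight * x - y) ** 2 with respect to weight."""
    return (weight * x - y) * x


def pumpout_style_step(weight, x, y, looks_clean, lr=0.1, gamma=0.5):
    """One update: descend on trusted samples, ascend (scaled) on suspect ones."""
    grad = squared_loss_grad(weight, x, y)
    if looks_clean:
        # Ordinary stochastic gradient descent: fit the sample.
        return weight - lr * grad
    # Scaled gradient ascent: actively push away from fitting a suspect label.
    return weight + gamma * lr * grad


weight = 1.0
fitted = pumpout_style_step(weight, x=1.0, y=3.0, looks_clean=True)
pushed = pumpout_style_step(weight, x=1.0, y=3.0, looks_clean=False)
print(fitted, pushed)
```

With these values the gradient is -2.0, so the descent step moves the weight toward the label and the ascent step moves it away by half as much, because `gamma` scales the ascent.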
Co-teaching: Robust Training Deep Neural Networks with Extremely Noisy Labels
Han, Bo, Yao, Quanming, Yu, Xingrui, Niu, Gang, Xu, Miao, Hu, Weihua, Tsang, Ivor, Sugiyama, Masashi
It is challenging to train deep neural networks robustly with noisy labels, as the capacity of deep neural networks is so high that they can totally over-fit on these noisy labels. In this paper, motivated by the memorization effects of deep networks, which show that networks fit clean instances first and then noisy ones, we present a new paradigm called "Co-teaching" for combating noisy labels. We train two networks simultaneously. First, in each mini-batch of data, each network filters noisy instances based on the memorization effects. Then, it teaches the remaining instances to its peer network for updating the parameters. Empirical results on benchmark datasets demonstrate that the robustness of deep learning models trained by the Co-teaching approach is far superior to that of state-of-the-art methods.