AITopics | self-adaptive training

self-adaptive training

A Experimental Setups A.1 Double descent phenomenon Following previous work [

Neural Information Processing Systems | Aug-16-2025, 22:21:11 GMT

Accuracy curves of model trained using ERM. Figure 7: Accuracy curves of a model trained on the noisy CIFAR10 training set with an 80% noise rate. For training, we use an initial learning rate of 0.1, a batch size of 128, and 100 training epochs. We split the training set into two portions: 1) Untouched portion, i.e., the elements in the training set which were left untouched; 2) Corrupted portion, i.e., the elements in ... The learning rate is linearly increased from 0.0003 ... Following common practice, we use random resizing, cropping and flipping augmentation during training. However, they only analyzed the generalization errors in the presence of corrupted labels. This occurs around the epochs between underfitting and overfitting.
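The untouched/corrupted split described in this excerpt can be illustrated with a small sketch. It assumes uniform (symmetric) label noise, where a chosen fraction of labels is replaced by classes drawn uniformly at random; the excerpt does not state the exact noise model, and the function name `corrupt_labels` is hypothetical.

```python
import numpy as np

def corrupt_labels(labels, noise_rate, num_classes, seed=0):
    """Replace a `noise_rate` fraction of labels with uniformly random classes.

    Assumption: symmetric noise. A replaced label may coincide with the
    original one by chance, so the effective error rate is slightly lower.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n = labels.shape[0]
    num_corrupted = int(round(noise_rate * n))
    # Pick which training elements belong to the corrupted portion.
    corrupted_idx = rng.choice(n, size=num_corrupted, replace=False)
    noisy = labels.copy()
    noisy[corrupted_idx] = rng.integers(0, num_classes, size=num_corrupted)
    # Boolean mask: True for the corrupted portion, False for the untouched one.
    is_corrupted = np.zeros(n, dtype=bool)
    is_corrupted[corrupted_idx] = True
    return noisy, is_corrupted
```

Keeping the mask makes it possible to track accuracy on each portion separately, which is how curves like those in the excerpt are typically produced.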

artificial intelligence, international conference, machine learning, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

e0ab531ec312161511493b002f9be2ee-Paper.pdf

Neural Information Processing Systems | Aug-16-2025, 22:21:04 GMT

artificial intelligence, machine learning, model prediction, (16 more...)

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > China (0.04)
(2 more...)

Industry: Government (0.46)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Review for NeurIPS paper: Self-Adaptive Training: beyond Empirical Risk Minimization

Neural Information Processing Systems | Feb-7-2025, 07:19:35 GMT

Weaknesses: The main weakness of the proposed approach is that it is not supported by any theoretical insight. In particular, the success of the method hinges on the premise that the model is able to guess the right predictions so as to correct the noisy labels. Since there is no theoretical criterion to verify that premise, it is not possible to predict whether the proposed method will work well on new learning tasks. Going further, one can imagine cases where this method would fail and actually perform worse than ERM. For instance, if the model is unable to capture sufficient information from the data distribution (because the data distribution is very complex, there are too few training samples, or the model does not have sufficient capacity), it would be impossible for the model to "bootstrap" its own predictions and guess the correct labels.

empirical risk minimization, neurips paper, self-adaptive training, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Review for NeurIPS paper: Self-Adaptive Training: beyond Empirical Risk Minimization

Neural Information Processing Systems | Feb-7-2025, 07:19:29 GMT

The paper focuses on the problem of learning from corrupted data (e.g. ... This objective can be interpreted as a form of self-training in which the model's predictions are progressively averaged with the true (and possibly noisy) labels, coupled with a sample weighting scheme that improves training stability. The authors show that this approach can be used for a variety of vision tasks, including classification under label noise, adversarial training, and selective classification. The reviewers appreciated the conceptual simplicity of the method, the clarity of the presentation, and the promising empirical results. The discussion phase focused on the following two drawbacks: - Theoretical justification: While the theoretical analysis is hard for the general case, it might be doable in the corrupted linear regression case, which could offer some valuable insights.
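The averaging-plus-weighting idea in this review can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the averaging is an exponential moving average with momentum `alpha`, and that each sample is weighted by the largest entry of its soft target. Neither detail is stated in the text above, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def self_adaptive_step(soft_targets, logits, alpha=0.9):
    """One target update and loss evaluation (gradient step omitted).

    soft_targets: (n, k) current per-sample targets, initialised from the
                  given (possibly noisy) one-hot labels.
    logits:       (n, k) current model outputs for the same samples.
    alpha:        assumed momentum of the moving average.
    """
    probs = softmax(logits)
    # Progressively average the stored targets with the model's predictions.
    new_targets = alpha * soft_targets + (1.0 - alpha) * probs
    # Assumed weighting scheme: confidence of the updated target.
    weights = new_targets.max(axis=1)
    # Weighted cross-entropy against the soft targets.
    ce = -(new_targets * np.log(probs + 1e-12)).sum(axis=1)
    loss = (weights * ce).sum() / weights.sum()
    return new_targets, loss
```

Because the targets start at the given labels and drift toward the predictions only gradually, a confidently mislabeled sample loses influence over time instead of being memorised, which is the behaviour the review attributes to the method.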

empirical risk minimization, neurips paper, self-adaptive training, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Self-Adaptive Training: beyond Empirical Risk Minimization

Neural Information Processing Systems | Oct-11-2024, 14:24:52 GMT

We propose self-adaptive training, a new training algorithm that dynamically calibrates the training process using model predictions without incurring extra computational cost, to improve the generalization of deep learning on potentially corrupted training data. This problem is important for robust learning from data that are corrupted by, e.g., random noise and adversarial examples. Standard empirical risk minimization (ERM) on such data, however, may easily overfit the noise and thus suffer from sub-optimal performance. In this paper, we observe that model predictions can substantially benefit the training process: self-adaptive training significantly mitigates the overfitting issue and improves generalization over ERM under both random and adversarial noise. Besides, in sharp contrast to the recently-discovered double-descent phenomenon in ERM, self-adaptive training exhibits a single-descent error-capacity curve, indicating that such a phenomenon might be a result of overfitting to noise.

empirical risk minimization, model prediction, self-adaptive training, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Training Private Models That Know What They Don't Know

Rabanser, Stephan, Thudi, Anvith, Thakurta, Abhradeep, Dvijotham, Krishnamurthy, Papernot, Nicolas

arXiv.org Artificial Intelligence | May-28-2023

Training reliable deep learning models which avoid making overconfident but incorrect predictions is a longstanding challenge. This challenge is further exacerbated when learning has to be differentially private: protection provided to sensitive data comes at the price of injecting additional randomness into the learning process. In this work, we conduct a thorough empirical investigation of selective classifiers -- that can abstain when they are unsure -- under a differential privacy constraint. We find that several popular selective prediction approaches are ineffective in a differentially private setting as they increase the risk of privacy leakage. At the same time, we identify that a recent approach that only uses checkpoints produced by an off-the-shelf private learning algorithm stands out as particularly suitable under DP. Further, we show that differential privacy does not just harm utility but also degrades selective classification performance. To analyze this effect across privacy levels, we propose a novel evaluation mechanism that isolates selective prediction performance across model utility levels. Our experimental results show that recovering the performance level attainable by non-private models is possible but comes at a considerable coverage cost as the privacy budget decreases.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence | arXiv:2305.18393

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Better Selective Classification

Feng, Leo, Ahmed, Mohamed Osama, Hajimirsadeghi, Hossein, Abdi, Amir

arXiv.org Artificial Intelligence | Mar-1-2023

We tackle the problem of Selective Classification where the objective is to achieve the best performance on a predetermined ratio (coverage) of the dataset. Recent state-of-the-art selective methods come with architectural changes either via introducing a separate selection head or an extra abstention logit. In this paper, we challenge the aforementioned methods. The results suggest that the superior performance of state-of-the-art methods is owed to training a more generalizable classifier rather than their proposed selection mechanisms. We argue that the best performing selection mechanism should instead be rooted in the classifier itself. Our proposed selection strategy uses the classification scores and achieves better results by a significant margin, consistently, across all coverages and all datasets, without any added compute cost. Furthermore, inspired by semi-supervised learning, we propose an entropy-based regularizer that improves the performance of selective classification methods. Our proposed selection mechanism with the proposed entropy-based regularizer achieves new state-of-the-art results. A model's ability to abstain from a decision when lacking confidence is essential in mission-critical applications. This is known as the Selective Prediction problem setting. The abstained and uncertain samples can be flagged and passed to a human expert for manual assessment, which, in turn, can improve the re-training process. This is crucial in problem settings where confidence is critical or an incorrect prediction can have significant consequences such as in the financial, medical, or autonomous driving domains. Several papers have tried to address this problem by estimating the uncertainty in the prediction.
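To make the coverage idea concrete, the sketch below accepts the most confident fraction of samples and reports accuracy on that subset only. It assumes the maximum softmax probability as the classification score; the abstract says only that the selection strategy "uses the classification scores", so the paper's exact rule may differ, and both function names are hypothetical.

```python
import numpy as np

def select_by_confidence(probs, coverage):
    """Return a boolean mask accepting the most confident `coverage` fraction.

    probs:    (n, k) predicted class probabilities.
    coverage: target fraction of samples to keep, in (0, 1].
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    n = probs.shape[0]
    num_accept = max(1, int(round(coverage * n)))
    # Assumed score: maximum class probability of each sample.
    confidence = probs.max(axis=1)
    order = np.argsort(-confidence, kind="stable")
    accepted = np.zeros(n, dtype=bool)
    accepted[order[:num_accept]] = True
    return accepted

def selective_accuracy(probs, labels, coverage):
    """Accuracy measured only on the accepted (non-abstained) samples."""
    accepted = select_by_confidence(probs, coverage)
    preds = probs.argmax(axis=1)
    return float((preds[accepted] == labels[accepted]).mean())
```

Samples outside the mask are the abstentions that, as the abstract notes, can be passed to a human expert.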

artificial intelligence, machine learning, selection mechanism, (15 more...)

arXiv.org Artificial Intelligence | arXiv:2206.09034

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Self-Adaptive Training: Bridging Supervised and Self-Supervised Learning

Huang, Lang, Zhang, Chao, Zhang, Hongyang

arXiv.org Artificial Intelligence | Oct-14-2022

We propose self-adaptive training -- a unified training algorithm that dynamically calibrates and enhances training processes by model predictions without incurring an extra computational cost -- to advance both supervised and self-supervised learning of deep neural networks. We analyze the training dynamics of deep networks on training data that are corrupted by, e.g., random noise and adversarial examples. Our analysis shows that model predictions are able to magnify useful underlying information in data and this phenomenon occurs broadly even in the absence of any label information, highlighting that model predictions could substantially benefit the training processes: self-adaptive training improves the generalization of deep networks under noise and enhances the self-supervised representation learning. The analysis also sheds light on understanding deep learning, e.g., a potential explanation of the recently-discovered double-descent phenomenon in empirical risk minimization and the collapsing issue of the state-of-the-art self-supervised learning algorithms. Experiments on the CIFAR, STL, and ImageNet datasets verify the effectiveness of our approach in three applications: classification with label noise, selective classification, and linear evaluation. To facilitate future research, the code has been made publicly available at https://github.com/LayneH/self-adaptive-training.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence | arXiv:2101.08732

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(2 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Government (0.46)
Education > Educational Setting (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Generalization by Recognizing Confusion

Chiu, Daniel, Wang, Franklyn, Kominers, Scott Duke

arXiv.org Machine Learning | Jun-13-2020

A recently-proposed technique called self-adaptive training augments modern neural networks by allowing them to adjust training labels on the fly, to avoid overfitting to samples that may be mislabeled or otherwise non-representative. By combining the self-adaptive objective with mixup, we further improve the accuracy of self-adaptive models for image recognition; the resulting classifier obtains state-of-the-art accuracies on datasets corrupted with label noise. Robustness to label noise implies a lower generalization gap; thus, our approach also leads to improved generalizability. We find evidence that the Rademacher complexity of these algorithms is low, suggesting a new path towards provable generalization for this type of deep learning model. Last, we highlight a novel connection between difficulties accounting for rare classes and robustness under noise, as rare classes are in a sense indistinguishable from label noise. Our code can be found at https://github.com/Tuxianeer/generalizationconfusion.
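For readers unfamiliar with mixup, the sketch below shows the standard operation applied to a batch whose targets are already soft, as they would be under a self-adaptive objective. How the paper actually combines the two is not specified in the abstract, so this is only an illustration of mixup itself; the function name and the default `beta_param` are assumptions.

```python
import numpy as np

def mixup_batch(inputs, soft_targets, beta_param=1.0, seed=0):
    """Blend each sample with a randomly chosen partner from the same batch.

    inputs:       (n, ...) batch of examples.
    soft_targets: (n, k) per-sample target distributions.
    beta_param:   assumed parameter of the Beta distribution for the mix ratio.
    """
    rng = np.random.default_rng(seed)
    lam = rng.beta(beta_param, beta_param)      # mixing coefficient in [0, 1]
    perm = rng.permutation(inputs.shape[0])     # partner index for each sample
    mixed_inputs = lam * inputs + (1.0 - lam) * inputs[perm]
    mixed_targets = lam * soft_targets + (1.0 - lam) * soft_targets[perm]
    return mixed_inputs, mixed_targets, lam
```

Since both the inputs and the targets are convex combinations, each mixed target row still sums to one and can be fed to a cross-entropy loss unchanged.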

artificial intelligence, machine learning, self-adaptive training, (19 more...)

arXiv.org Machine Learning | arXiv:2006.07737

Country: