Invariance Pair-Guided Learning: Enhancing Robustness in Neural Networks

Surner, Martin, Khelil, Abdelmajid, Bothmann, Ludwig

Feb-26-2025–arXiv.org Artificial Intelligence

Out-of-distribution generalization of machine learning models remains challenging because the models are inherently bound to the training data distribution. This is especially apparent when the learned models rely on spurious correlations. Most existing approaches apply data manipulation, representation learning, or learning strategies to achieve generalizable models. Unfortunately, these approaches usually require multiple training domains, group labels, specialized augmentation, or pre-processing. We propose a novel approach that addresses these limitations by providing a technique to guide the neural network through the training phase. We first establish input pairs that represent the spurious attribute and describe the invariance, a characteristic that should not affect the outcome of the model. Based on these pairs, we form a corrective gradient that complements the traditional gradient descent approach. We further make this correction mechanism adaptive based on a predefined invariance condition. Experiments on the ColoredMNIST, Waterbird-100, and CelebA datasets demonstrate the effectiveness of our approach and its robustness to group shifts.

[Figure caption: The first two loss gradients are scaled to two-thirds the length of the corrective gradient due to the violation of the invariance condition. Invariance pairs for (b) ColoredMNIST, (c) Waterbird-100, and (d) CelebA are used for the invariance condition and corrective gradient formulation.]

1 Introduction

The ability to learn representations from data makes neural networks highly applicable to various tasks. However, models are inherently limited by the distribution of the training data. In training machine learning models, we usually assume that training data and test data are independent and identically distributed (i.i.d.) samples from the same data-generating process, yet this assumption often does not hold in real-world scenarios.
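As a rough illustration of the mechanism outlined in the abstract (a corrective gradient built from invariance pairs and applied only when an invariance condition is violated), the following minimal sketch may help. It is not the authors' implementation. Everything in it is an assumption made for illustration: the logistic model, the squared output difference used as the invariance measure, the threshold `eps`, the function names, and the toy data. The only detail taken from the text is the two-thirds scaling of the loss gradient relative to the corrective gradient mentioned in the figure caption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad(w, X, y):
    """Gradient of the mean binary cross-entropy of a logistic model."""
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def invariance_gap(w, xa, xb):
    """Mean squared output difference over the invariance pairs (assumed measure)."""
    return np.mean((sigmoid(xa @ w) - sigmoid(xb @ w)) ** 2)

def corrective_grad(w, xa, xb):
    """Analytic gradient of invariance_gap with respect to w."""
    pa, pb = sigmoid(xa @ w), sigmoid(xb @ w)
    d = pa - pb
    ga = (pa * (1.0 - pa))[:, None] * xa
    gb = (pb * (1.0 - pb))[:, None] * xb
    return 2.0 * np.mean(d[:, None] * (ga - gb), axis=0)

def guided_direction(w, X, y, xa, xb, eps=1e-3, ratio=2.0 / 3.0):
    """Descent direction: the plain loss gradient while the invariance
    condition holds, otherwise the rescaled loss gradient plus the
    corrective gradient (assumed combination rule)."""
    g = loss_grad(w, X, y)
    if invariance_gap(w, xa, xb) <= eps:
        return g  # condition satisfied: ordinary gradient descent
    c = corrective_grad(w, xa, xb)
    g_norm, c_norm = np.linalg.norm(g), np.linalg.norm(c)
    if c_norm == 0.0:
        return g  # no usable corrective signal
    if g_norm > 0.0:
        g = g * (ratio * c_norm / g_norm)  # loss gradient shorter than c
    return g + c

# Toy data (hypothetical): feature 0 is the core signal, feature 1 is spurious.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [0.5, 1.0], [-0.5, -1.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
# Each invariance pair differs only in the spurious feature.
xa = np.array([[1.0, 1.0], [-1.0, 1.0], [0.5, 1.0]])
xb = np.array([[1.0, -1.0], [-1.0, -1.0], [0.5, -1.0]])

w = np.array([0.5, 1.0])
for _ in range(200):
    w = w - 0.1 * guided_direction(w, X, y, xa, xb)
print("weights:", w, "invariance gap:", invariance_gap(w, xa, xb))
```

Under this assumed rule, keeping the rescaled loss gradient shorter than the corrective gradient (`ratio` below 1) guarantees that the combined direction has a positive inner product with the corrective gradient, so a sufficiently small step reduces the invariance measure even when the two gradients point in opposing directions.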

international conference, invariance, spurious correlation


Country:
- Asia > India (0.04)
- Europe > Germany
  - Bavaria > Upper Bavaria > Munich (0.04)

Genre:
- Research Report > Promising Solution (0.48)
- Overview > Innovation (0.34)

Industry:
- Health & Medicine (0.47)
- Transportation (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
