Invariance Pair-Guided Learning: Enhancing Robustness in Neural Networks
Surner, Martin, Khelil, Abdelmajid, Bothmann, Ludwig
arXiv.org Artificial Intelligence
Out-of-distribution generalization of machine learning models remains challenging because the models are inherently bound to the training data distribution. This is especially apparent when the learned models rely on spurious correlations. Most existing approaches apply data manipulation, representation learning, or learning strategies to achieve generalizable models. Unfortunately, these approaches usually require multiple training domains, group labels, specialized augmentation, or pre-processing to reach generalizable models. We propose a novel approach that addresses these limitations by providing a technique to guide the neural network through the training phase. We first establish input pairs that represent the spurious attribute and describe the invariance, a characteristic that should not affect the outcome of the model. Based on these pairs, we form a corrective gradient that complements the traditional gradient descent approach. We further make this correction mechanism adaptive based on a predefined invariance condition. Experiments on the ColoredMNIST, Waterbird-100, and CelebA datasets demonstrate the effectiveness of our approach and its robustness to group shifts.

[Figure caption, partially recovered] The first two loss gradients are scaled to two-thirds the length of the corrective gradient due to the violation of the invariance condition. Invariance pairs for (b) ColoredMNIST, (c) Waterbird-100, and (d) CelebA are used for the invariance condition and corrective gradient formulation.

1 Introduction

The ability to learn representations from data makes neural networks highly applicable to various tasks. However, models are inherently limited by the distribution of the training data. In training machine learning models, we usually assume that training data and test data are independent and identically distributed (i.i.d.) samples from the same data-generating process, yet this assumption often does not hold in real-world scenarios.
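The training scheme summarized in the abstract (invariance pairs, a corrective gradient, and an adaptive correction tied to an invariance condition) can be illustrated with a toy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the linear model, the squared output gap used as the invariance penalty, the tolerance `tol`, and all function names. The `ratio = 2/3` rescaling is one possible reading of the figure caption, in which the loss gradient is scaled to two-thirds the length of the corrective gradient while the invariance condition is violated.

```python
import math

# Toy sketch only: a linear model f(x) = w . x with two features,
# [core, spurious]. All names and formulas here are hypothetical.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def task_gradient(w, x, y):
    # Gradient of the squared error (w . x - y)^2 with respect to w.
    err = dot(w, x) - y
    return [2.0 * err * xi for xi in x]

def corrective_gradient(w, pair):
    # Gradient of (w . a - w . b)^2 with respect to w (assumed penalty).
    # The pair (a, b) differs only in the spurious attribute, so this
    # gradient pushes the model to give both inputs the same output.
    a, b = pair
    diff = [ai - bi for ai, bi in zip(a, b)]
    gap = dot(w, diff)
    return [2.0 * gap * di for di in diff]

def invariance_violated(w, pair, tol=1e-3):
    # Assumed invariance condition: outputs on the pair agree within tol.
    a, b = pair
    return abs(dot(w, a) - dot(w, b)) > tol

def guided_step(w, x, y, pair, lr=0.1, ratio=2.0 / 3.0, tol=1e-3):
    g_task = task_gradient(w, x, y)
    if invariance_violated(w, pair, tol):
        g_corr = corrective_gradient(w, pair)
        n_task = norm(g_task)
        if n_task > 0.0:
            # Rescale the task gradient to `ratio` times the length of
            # the corrective gradient, so the correction dominates.
            scale = ratio * norm(g_corr) / n_task
            g_task = [scale * g for g in g_task]
        g = [gt + gc for gt, gc in zip(g_task, g_corr)]
    else:
        # Condition satisfied: ordinary gradient descent step.
        g = g_task
    return [wi - lr * gi for wi, gi in zip(w, g)]

def fit(w, x, y, pair, steps=2000):
    for _ in range(steps):
        w = guided_step(w, x, y, pair)
    return w

# Training point where the spurious feature agrees with the label,
# and an invariance pair that differs only in the spurious feature.
x_train, y_train = [1.0, 1.0], 1.0
pair = ([1.0, 0.0], [1.0, 1.0])
w_final = fit([0.0, 1.0], x_train, y_train, pair)
```

In this toy setup the model starts by relying entirely on the spurious feature. The corrective term shrinks that weight whenever the pair outputs disagree, and the fit to the label is recovered through the core feature. With a real network, the two gradients would come from automatic differentiation rather than the closed forms used here.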
Feb-26-2025