AITopics | invariance

Collaborating Authors

invariance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels

Paquin, Alexandre Lemire, Chaib-Draa, Brahim, Giguère, Philippe

arXiv.org Machine LearningMay-21-2026

Labeling a training set is often expensive and susceptible to errors, making the design of robust loss functions for label noise an important problem. The symmetry condition provides theoretical guarantees for robustness to such noise. In this work, we study a symmetrization method arising from the unique decomposition of any multi-class loss function into a symmetric component and a class-insensitive term. In particular, symmetrizing the cross-entropy loss leads to a linear multi-class extension of the unhinged loss. Unlike in the binary case, the multi-class version must have specific coefficients in order to satisfy the symmetry condition. Under suitable assumptions, we show that this multi-class unhinged loss is the unique convex multi-class symmetric loss. We also show that it has a fundamental local role: the linear approximation of any symmetric loss around score vectors with equal components is equivalent to the multi-class unhinged loss. We then introduce SGCE and alpha-MAE, two loss functions that interpolate between the multi-class unhinged loss and the Mean Absolute Error while allowing control of the beta-smoothness of the loss. Experiments on standard noisy-label benchmarks show competitive performance compared with existing robust loss functions.

artificial intelligence, loss function, machine learning, (18 more...)

arXiv.org Machine Learning

2605.20347

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

The Geometry of Projection Heads: Conditioning, Invariance, and Collapse

Chaudhry, Faris

arXiv.org Machine LearningMay-19-2026

We develop a geometric theory of projection heads in self-supervised learning by modeling the head as a trainable Riemannian metric on the backbone representation manifold. We show that linear heads perform implicit subspace whitening, while nonlinear heads adapt local metrics to satisfy the specific topological constraints of the loss, with head depth empirically dictating this capacity. Analyzing dimensional collapse, we prove that smooth nonlinear heads natively induce negative eigenvalues in the Hessian at collapsed equilibria, making them unstable. We empirically validate this by continuously tracking the optimization geometry during training, which reveals that smooth activations like Swish can generate explicit negative curvature to escape collapse, whereas linear and ReLU heads under continuous-time gradient flow cannot, relying instead on discrete-time optimization dynamics and BatchNorm. Finally, we geometrically characterize how metric degeneracy governs the information-invariance trade-off, explaining why the head must be discarded. Evaluated across contrastive and decorrelation-based objectives on foundation models, our results demonstrate that the projection head acts as a universal geometric buffer, decoupling the semantic backbone from the rigid, destructive constraints of the pretraining objective.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

2605.1718

Country: Asia (0.45)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

fe4da14f07561a232782820d30ea22f3-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 10:25:03 GMT

artificial intelligence, machine learning, mutual information, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Provably Strict Generalisation Benefit for Invariance in Kernel Methods

Neural Information Processing SystemsApr-26-2026, 16:25:44 GMT

It is a commonly held belief that enforcing invariance improves generalisation. Although this approach enjoys widespread popularity, it is only very recently that a rigorous theoretical demonstration of this benefit has been established. In this work we build on the function space perspective of Elesedy and Zaidi [8] to derive a strictly non-zero generalisation benefit of incorporating invariance in kernel ridge regression when the target is invariant to the action of a compact group. We study invariance enforced by feature averaging and find that generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group. In building towards this result, we find that the action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space and its kernel, which may be of interest in its own right.

artificial intelligence, invariance, machine learning, (11 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.41)

Add feedback

Laplacian Canonization: AMinimalist Approach to Sign and Basis Invariant Spectral Embedding

Neural Information Processing SystemsApr-26-2026, 00:23:55 GMT

Spectral embedding is a powerful graph embedding technique that has received a lot of attention recently due to its effectiveness on Graph Transformers. However, from a theoretical perspective, the universal expressive power of spectral embedding comes at the price of losing two important invariance properties of graphs, sign and basis invariance, which also limits its effectiveness on graph data. To remedy this issue, many previous methods developed costly approaches to learn new invariants and suffer from high computation complexity. In this work, we explore a minimal approach that resolves the ambiguity issues by directly finding canonical directions for the eigenvectors, named Laplacian Canonization (LC). As a pure pre-processing method, LC is light-weighted and can be applied to any existing GNNs. We provide a thorough investigation, from theory to algorithm, on this approach, and discover an efficient algorithm named Maximal Axis Projection (MAP) that works for both sign and basis invariance and successfully canonizes more than 90% of all eigenvectors. Experiments on real-world benchmark datasets like ZINC, MOLTOX21, and MOLPCBA show that MAP consistently outperforms existing methods while bringing minimal computation overhead.

artificial intelligence, eigenvector, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

ARoto translation invariance

Neural Information Processing SystemsApr-25-2026, 09:56:14 GMT

A.1 Rotations in 2 dimensions In 2-dimensional settings, there exists a single scalar angular position, the yaw angle θ. In order to perform the transformation, we have to express the angular positions in a format suitable for linear transformations; we do so by transforming them to rotation matrices, perform a matrix multiplication, and then transform the angular positions back to angle format. In 2 dimensions, we use eq. After the rotation, we can convert them back to angle format using the 2-argument arc-tangent function: θ = atan2(sinθ,cosθ) (14) Simplified rotations In 2 dimensions, the computations can be simplified since rotations commute. First, we show that chained rotations result in angle addition/subtraction, that is: Q(θi) Q(θj) = cosθi sinθi sinθicosθi cosθj sinθj sinθjcosθj (15) = cosθicosθj sinθisinθj cosθisinθj sinθicosθj sinθicosθj +cosθisinθj sinθisinθj +cosθicosθj (16) = cos(θi +θj) sin(θi +θj) sin(θi +θj) cos(θi +θj) (17) = Q(θi +θj) (18) Following the same approach, we compute the inverse rotation: Q (θi) Q(θj) = Q( θi) Q(θj) = Q(θj θi) (19) Thus, instead of rotating the angular positions (expressed in rotation matrix form) using the rotation matrix Q, in practice we perform the transformation directly to the angles via addition/subtraction, and replace the matrix Qwith the identity matrix I1 1.

artificial intelligence, dimension, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Roto-translated Local Coordinate Frames For Interacting Dynamical Systems

Neural Information Processing SystemsApr-25-2026, 09:56:10 GMT

Modelling interactions is critical in learning complex dynamical systems, namely systems of interacting objects with highly non-linear and time-dependent behaviour. A large class of such systems can be formalized as geometric graphs, i.e., graphs with nodes positioned in the Euclidean space given an arbitrarily chosen global coordinate system, for instance vehicles in a traffic scene. Notwithstanding the arbitrary global coordinate system, the governing dynamics of the respective dynamical systems are invariant to rotations and translations, also known as Galilean invariance. As ignoring these invariances leads to worse generalization, in this work we propose local coordinate frames per node-object to induce roto-translation invariance to the geometric graph of the interacting dynamical system. Further, the local coordinate frames allow for a natural definition of anisotropic filtering in graph neural networks. Experiments in traffic scenes, 3D motion capture, and colliding particles demonstrate that the proposed approach comfortably outperforms the recent state-of-the-art.

coordinate frame, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Learning Causal Semantic Representation for Out-of-Distribution Prediction

Neural Information Processing SystemsApr-25-2026, 08:49:22 GMT

Conventional supervised learning methods, especially deep ones, are found to be sensitive to out-of-distribution (OOD) examples, largely because the learned representation mixes the semantic factor with the variation factor due to their domain-specific correlation, while only the semantic factor causes the output. To address the problem, we propose a Causal Semantic Generative model (CSG) based on a causal reasoning so that the two factors are modeled separately, and develop methods for OOD prediction from a single training domain, which is common and challenging. The methods are based on the causal invariance principle, with a novel design in variational Bayes for both efficient learning and easy prediction. Theoretically, we prove that under certain conditions, CSG can identify the semantic factor by fitting training data, and this semantic-identification guarantees the boundedness of OOD generalization error and the success of adaptation. Empirical study shows improved OOD performance over prevailing baselines.

artificial intelligence, international conference, machine learning, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.93)
Europe (0.67)
North America > United States > California (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Filters

Collaborating Authors

invariance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels

The Geometry of Projection Heads: Conditioning, Invariance, and Collapse

fe4da14f07561a232782820d30ea22f3-Supplemental-Conference.pdf

e32349fe7e3cd4f9ef598c2b7b7a31f4-Paper-Conference.pdf

Provably Strict Generalisation Benefit for Invariance in Kernel Methods

581b41df0cd50ace849e061ef74827fc-Paper.pdf

Laplacian Canonization: AMinimalist Approach to Sign and Basis Invariant Spectral Embedding

ARoto translation invariance

Roto-translated Local Coordinate Frames For Interacting Dynamical Systems

Learning Causal Semantic Representation for Out-of-Distribution Prediction