symmetrization
A Appendix
A.1 Proofs
A.1.1 Proof of Theorem 1 (Section 2.1)
Let ψ: X → Y be an arbitrary G-equivariant function. After handling the translation component of the Euclidean group E(d)/SE(d) as in Eq. (29), the proposed symmetrization distribution p (Propositions 3 and 6-8) recovers frame averaging; therefore, probabilistic symmetrization can become frame averaging.
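The concluding claim can be made concrete. As a minimal sketch in the standard notation of the probabilistic-symmetrization and frame-averaging literature (the names f, p(g|x), and F(x) are assumptions here, not necessarily this appendix's own symbols), both constructions symmetrize a base function f by averaging over group elements:

```latex
% Probabilistic symmetrization: average over a learned,
% input-conditional distribution p(g|x) on the group G.
\phi(x) = \mathbb{E}_{g \sim p(g \mid x)}\!\left[ g \cdot f(g^{-1} \cdot x) \right]

% Frame averaging: average uniformly over a frame F(x) \subseteq G.
\langle f \rangle_F(x) = \frac{1}{|F(x)|} \sum_{g \in F(x)} g \cdot f(g^{-1} \cdot x)
```

When p(g|x) places uniform mass on a finite frame F(x), the expectation collapses to the frame average, which is the sense in which probabilistic symmetrization can become frame averaging.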
Two tales for a geometric Jensen--Shannon divergence
The geometric Jensen--Shannon divergence (G-JSD) gained popularity in machine learning and information sciences thanks to its closed-form expression between Gaussian distributions. In this work, we introduce an alternative definition of the geometric Jensen--Shannon divergence, tailored to positive densities, which does not normalize geometric mixtures. This novel divergence is termed the extended G-JSD, as it applies to the more general case of positive measures. We report explicitly the gap between the extended G-JSD and the G-JSD when considering probability densities, and show how to express the G-JSD and extended G-JSD using the Jeffreys divergence and the Bhattacharyya distance or Bhattacharyya coefficient. The extended G-JSD is proven to be an $f$-divergence, a separable divergence satisfying information monotonicity and invariance in information geometry. We derive the corresponding closed-form formulas for the two types of G-JSD in the case of multivariate Gaussian distributions often met in applications. We consider Monte Carlo stochastic estimations and approximations of the two types of G-JSD using the projective $\gamma$-divergences. Although the square root of the JSD yields a metric distance, we show that this is no longer the case for the two types of G-JSD. Finally, we explain how these two types of geometric JSD can be interpreted as regularizations of the ordinary JSD.
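To make the two constructions concrete, here is a minimal sketch using the standard notation of the G-JSD literature (the exact symbols in the paper may differ). The skewed geometric mixture is normalized by the coefficient $Z_\alpha(p,q) = \int p^{1-\alpha} q^{\alpha} \,\mathrm{d}\mu$, i.e., the skew Bhattacharyya coefficient:

```latex
(pq)^{G}_{\alpha} = \frac{p^{1-\alpha} q^{\alpha}}{Z_\alpha(p,q)}, \qquad
\mathrm{JS}^{G}_{\alpha}(p:q) = (1-\alpha)\,\mathrm{KL}\big(p : (pq)^{G}_{\alpha}\big)
  + \alpha\,\mathrm{KL}\big(q : (pq)^{G}_{\alpha}\big).

% Expanding KL(p : (pq)^G_\alpha) = \alpha\,KL(p:q) + \log Z_\alpha and
% KL(q : (pq)^G_\alpha) = (1-\alpha)\,KL(q:p) + \log Z_\alpha yields
\mathrm{JS}^{G}_{\alpha}(p:q) = \alpha(1-\alpha)\, J(p:q) - B_\alpha(p:q),
```

where $J(p:q) = \mathrm{KL}(p:q) + \mathrm{KL}(q:p)$ is the Jeffreys divergence and $B_\alpha = -\log Z_\alpha$ is the skew Bhattacharyya distance, which is one way the abstract's claimed identities can arise. The extended G-JSD then replaces the normalized mixture with the unnormalized $p^{1-\alpha} q^{\alpha}$ and uses the KL extension to positive measures, $\mathrm{KL}_{+}(\tilde p : \tilde q) = \int \big(\tilde p \log(\tilde p/\tilde q) - \tilde p + \tilde q\big)\,\mathrm{d}\mu$, so the gap between the two divergences is controlled by $Z_\alpha$.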
Effective Field Neural Network
Liu, Xi, Zhao, Yujun, Wan, Chun Yu, Zhang, Yang, Liu, Junwei
In recent years, with the rapid development of machine learning, physicists have been exploring its new applications in solving or alleviating the curse of dimensionality in many-body problems. To accurately reflect the underlying physics of a problem, domain knowledge must be encoded into the machine learning algorithms. In this work, inspired by field theory, we propose a new set of machine learning models called effective field neural networks (EFNNs) that can automatically and efficiently capture important many-body interactions through multiple self-refining processes. Taking the classical 3-spin infinite-range model and the quantum double-exchange model as case studies, we explicitly demonstrate that EFNNs significantly outperform fully-connected deep neural networks (DNNs) and the effective model. Furthermore, with the help of convolution operations, EFNNs learned on a small system can be seamlessly applied to a larger system without additional training, and the relative errors even decrease, which further demonstrates the efficacy of EFNNs in representing core physical behaviors.
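The abstract does not give the architecture, so the following is only a minimal sketch of the idea as described: an effective field that is iteratively refined by a shared convolutional block. The class name `EffectiveFieldNet` and the parameter `n_refine` are hypothetical. Because every layer is convolutional, the same weights apply to lattices of any size, which is the size-transfer property the abstract reports.

```python
import torch
import torch.nn as nn

class EffectiveFieldNet(nn.Module):
    """Hypothetical sketch of an effective-field-style network:
    a local field is repeatedly refined by one shared conv block."""

    def __init__(self, channels: int = 16, n_refine: int = 4):
        super().__init__()
        self.embed = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        # One shared refinement block, applied n_refine times: each pass
        # lets the effective field absorb longer-range, higher-order
        # interactions (the "self-refining process").
        self.refine = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.n_refine = n_refine
        self.readout = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, spins: torch.Tensor) -> torch.Tensor:
        # spins: (batch, 1, L, L) spin configuration on an L x L lattice
        field = self.embed(spins)
        for _ in range(self.n_refine):
            field = field + self.refine(field)  # self-refining residual pass
        # Per-site energy density summed to a size-extensive total energy;
        # convolutions keep the same weights valid for any lattice size L.
        return self.readout(field).sum(dim=(1, 2, 3))

# A model trained on 8x8 configurations evaluates 16x16 ones unchanged:
model = EffectiveFieldNet()
e_small = model(torch.randn(2, 1, 8, 8))
e_large = model(torch.randn(2, 1, 16, 16))
```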
Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schr\"odinger Equation
Huang, Kevin Han, Zhan, Ni, Ertekin, Elif, Orbanz, Peter, Adams, Ryan P.
Incorporating group symmetries into neural networks has been a cornerstone of success in many AI-for-science applications. Diagonal groups of isometries, which describe the invariance under a simultaneous movement of multiple objects, arise naturally in many-body quantum problems. Despite their importance, diagonal groups have received relatively little attention, as they lack a natural choice of invariant maps except in special cases. We study different ways of incorporating diagonal invariance in neural network ans\"atze trained via variational Monte Carlo methods, and consider specifically data augmentation, group averaging and canonicalization. We show that, contrary to standard ML setups, in-training symmetrization destabilizes training and can lead to worse performance. Our theoretical and numerical results indicate that this unexpected behavior may arise from a unique computational-statistical tradeoff not found in standard ML analyses of symmetrization. Meanwhile, we demonstrate that post hoc averaging is less sensitive to such tradeoffs and emerges as a simple, flexible and effective method for improving neural network solvers.
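One reading of "post hoc averaging" is to symmetrize an already-trained ansatz over a finite diagonal group at evaluation time only. The sketch below assumes that reading; the function name, the log-domain averaging, and the toy Z2 group are all illustrative choices, not the paper's method.

```python
import torch

def posthoc_group_average(log_psi, group_ops, coords):
    """Hypothetical sketch of post hoc group averaging: symmetrize a
    trained wavefunction ansatz over a finite diagonal group G by
    averaging psi(g(x)) over group elements at evaluation time,
    with no in-training symmetrization.

    log_psi:   callable, (batch, n_electrons, 3) -> log|psi|
    group_ops: list of callables, each applying one isometry g
               simultaneously to all electron coordinates
    coords:    (batch, n_electrons, 3) electron positions
    """
    logs = torch.stack([log_psi(g(coords)) for g in group_ops], dim=0)
    # Numerically stable log of (1/|G|) * sum_g exp(log_psi(g x)).
    # Note: this averages |psi| over the orbit, so it assumes psi has
    # a constant sign on each orbit; phases need separate handling.
    return torch.logsumexp(logs, dim=0) - torch.log(
        torch.tensor(float(len(group_ops)))
    )

# Toy example with the diagonal Z2 group {identity, global inversion}:
ops = [lambda x: x, lambda x: -x]
log_psi = lambda x: -x.pow(2).sum(dim=(1, 2))  # toy Gaussian-like ansatz
sym_log_psi = posthoc_group_average(log_psi, ops, torch.randn(4, 3, 3))
```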
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Wang, Jinguang, Yin, Yuexi, Sun, Haifeng, Qi, Qi, Wang, Jingyu, Zhuang, Zirui, Yang, Tingting, Liao, Jianxin
Quantizing the activations of large language models (LLMs) has been a significant challenge due to the presence of structured outliers. Most existing methods focus on per-token or per-tensor quantization of activations, making it difficult to achieve both accuracy and hardware efficiency. To address this problem, we propose OutlierTune, an efficient per-channel post-training quantization (PTQ) method for the activations of LLMs. OutlierTune consists of two components: pre-execution of dequantization and symmetrization. The pre-execution of dequantization updates the model weights by the activation scaling factors, avoiding the internal scaling and the costly additional computation brought by per-channel activation quantization. The symmetrization further reduces the quantization differences arising from the weight updates by ensuring balanced numerical ranges across different activation channels. OutlierTune is easy to implement and hardware-efficient, introducing almost no additional computational overhead during inference. Extensive experiments show that the proposed framework outperforms existing methods across multiple different tasks. Demonstrating better generalization, the framework improves the Int6 quantization of instruction-tuned LLMs, such as OPT-IML, to the same level as half precision (FP16). Moreover, the proposed framework is 1.48x faster than the FP16 implementation while reducing memory usage by approximately 2x.
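The two components can be sketched in the usual per-channel PTQ recipe: fold per-channel activation scales into the consuming layer's weights (so no channel-wise rescaling happens at runtime), and shift each channel by the midpoint of its range so the ranges become symmetric, absorbing the shift into the bias. This is a minimal sketch under those assumptions; the function name is hypothetical and this is not the paper's reference implementation.

```python
import numpy as np

def fold_scales_and_symmetrize(x_calib, W, b):
    """Hypothetical sketch of per-channel activation PTQ in the spirit
    of OutlierTune (not the paper's reference implementation).

    x_calib: (n_samples, d_in) calibration activations
    W:       (d_in, d_out) weights of the layer consuming x
    b:       (d_out,) bias of that layer
    """
    # Symmetrization: center each channel's range at 0, balancing
    # numerical ranges across channels; the shift is absorbed into
    # the bias so the layer output is unchanged.
    center = (x_calib.max(0) + x_calib.min(0)) / 2      # (d_in,)
    b = b + center @ W                                   # fold shift into bias
    x_centered = x_calib - center

    # Pre-execution of dequantization: fold per-channel scales into W,
    # so the kernel never rescales activations channel by channel.
    scale = np.abs(x_centered).max(0) / 127.0            # Int8 per-channel scale
    scale = np.maximum(scale, 1e-8)
    W_folded = W * scale[:, None]                        # absorb dequant into weights

    def quantize(x):
        # Runtime path: shift, divide by per-channel scale, round to int8.
        q = np.clip(np.round((x - center) / scale), -127, 127)
        return q.astype(np.int8)

    # The layer output is then approximately quantize(x) @ W_folded + b.
    return quantize, W_folded, b
```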
Baking Symmetry into GFlowNets
Ma, George, Bengio, Emmanuel, Bengio, Yoshua, Zhang, Dinghuai
GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects incrementally and aim to learn a policy that samples objects with probability proportional to their rewards. However, current training pipelines for GFlowNets do not account for isomorphic actions, i.e., actions that lead to symmetric or isomorphic states. This failure to account for symmetry increases the number of samples required to train GFlowNets and can result in inefficient and potentially incorrect flow functions, which in turn lowers the reward and diversity of the generated objects. In this study, our objective is to integrate symmetries into GFlowNets by identifying equivalent actions during the generation process. Experimental results on synthetic data demonstrate the promising performance of our proposed approaches.
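One natural way to "identify equivalent actions" is to canonicalize each child state and merge actions whose children share a canonical form, pooling their policy probability. The sketch below assumes that approach; the function names and the hashable canonical form are stand-ins (the paper's actual equivalence test, e.g., a graph-isomorphism check, may differ).

```python
from collections import defaultdict

def merge_isomorphic_actions(state, actions, child_of, canonical, policy_probs):
    """Hypothetical sketch: group a GFlowNet state's actions by the
    canonical form of the child state they produce, and sum policy
    probability over each equivalence class, so isomorphic actions
    are treated as a single action during training.

    actions:      list of candidate actions at `state`
    child_of:     callable (state, action) -> child state
    canonical:    callable state -> hashable canonical form
                  (e.g., an isomorphism-invariant graph hash)
    policy_probs: dict action -> probability under the current policy
    """
    classes = defaultdict(list)
    for a in actions:
        classes[canonical(child_of(state, a))].append(a)

    merged = {}
    for key, members in classes.items():
        # All members lead to isomorphic children: pool their probability
        # mass so flow into that child is counted once, not |members| times.
        merged[key] = (members[0], sum(policy_probs[a] for a in members))
    return merged
```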