AITopics | orthogonality regularization

orthogonality regularization

Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Neural Information Processing Systems | Apr-25-2026, 13:15:12 GMT

Modern convolutional neural networks (CNNs) have massive identical convolution blocks, and, hence, recursive sharing of parameters across these blocks has been proposed to reduce the number of parameters. However, naive sharing of parameters poses many challenges, such as limited representational power and the vanishing/exploding gradients problem of recursively shared parameters. In this paper, we present a recursive convolution block design and training method in which a recursively shareable part, or filter basis, is separated and learned while effectively avoiding the vanishing/exploding gradients problem during training. We show that the unwieldy vanishing/exploding gradients problem can be controlled by enforcing the elements of the filter basis to be orthonormal, and empirically demonstrate that the proposed orthogonality regularization improves the flow of gradients during training. Experimental results on image classification and object detection show that our approach, unlike previous parameter-sharing approaches, does not trade performance to save parameters and consistently outperforms overparameterized counterpart networks. This superior performance demonstrates that the proposed recursive convolution block design and the orthogonality regularization not only prevent performance degradation, but also consistently improve the representation capability while a significant number of parameters are recursively shared.
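
To make the orthonormality condition in this abstract concrete, here is a minimal sketch in plain Python. It assumes each basis filter has been flattened into a vector and penalizes the squared Frobenius distance between the Gram matrix of the basis and the identity. The function names and the exact form of the penalty are illustrative assumptions; the abstract does not state the loss the authors use.

```python
def gram(basis):
    """Gram matrix of a list of flattened basis filters (one filter per row)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in basis] for u in basis]


def orthonormality_penalty(basis):
    """Squared Frobenius norm of (B B^T - I) for a row-stacked basis B.

    The penalty is zero exactly when the filters are mutually orthogonal
    and each has unit norm (assumed form, not taken from the paper).
    """
    g = gram(basis)
    size = len(g)
    return sum(
        (g[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(size)
        for j in range(size)
    )


# Two hypothetical flattened filters in a 3-dimensional space.
orthonormal_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
skewed_basis = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

print(orthonormality_penalty(orthonormal_basis))  # 0.0
print(orthonormality_penalty(skewed_basis))       # 3.0
```

In training, a term like this would typically be added to the task loss with a small weight, so that the shared basis stays close to orthonormal without being constrained exactly.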

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can We Gain More from Orthogonality Regularizations in Training Deep Networks?

Nitin Bansal, Xiaohan Chen, Zhangyang Wang

Neural Information Processing Systems | Feb-14-2026, 10:04:46 GMT

Neural Information Processing Systems http://nips.cc/

arxiv preprint arxiv, regularization, regularizer, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

3cf2559725a9fdfa602ec8c887440f32-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 07:35:21 GMT

convolution operator, filter basis, orthogonality regularization, (13 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Incheon > Incheon (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can We Gain More from Orthogonality Regularizations in Training Deep Networks?

Neural Information Processing Systems | Nov-20-2025, 22:56:30 GMT

This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop novel orthogonality regularizations for training deep CNNs, utilizing various advanced analytical tools such as mutual coherence and the restricted isometry property. These plug-and-play regularizations can be conveniently incorporated into training almost any CNN without extra hassle.
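
As an illustration of one of the analytical tools named above, the following sketch computes the mutual coherence of a set of weight vectors, using the standard definition: the largest absolute cosine similarity between any two distinct vectors. It only evaluates the quantity; how the paper turns it into a regularizer is not stated in this abstract, and the function name is an assumption.

```python
import math


def mutual_coherence(columns):
    """Largest absolute cosine similarity between two distinct vectors.

    `columns` is a list of nonzero vectors (for example, the columns of a
    weight matrix). A value of 0 means the vectors are mutually orthogonal.
    """
    worst = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            dot = sum(a * b for a, b in zip(columns[i], columns[j]))
            norm_i = math.sqrt(sum(a * a for a in columns[i]))
            norm_j = math.sqrt(sum(b * b for b in columns[j]))
            worst = max(worst, abs(dot) / (norm_i * norm_j))
    return worst


orthogonal_columns = [[1.0, 0.0], [0.0, 2.0]]
correlated_columns = [[1.0, 0.0], [1.0, 1.0]]

print(mutual_coherence(orthogonal_columns))  # 0.0
print(mutual_coherence(correlated_columns))  # about 0.7071
```

Because the measure is normalized, it is insensitive to the scale of each vector, which distinguishes it from penalties that also push the norms toward one.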

name change, orthogonality regularization, training deep network, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Can We Gain More from Orthogonality Regularizations in Training Deep Networks?

Nitin Bansal, Xiaohan Chen, Zhangyang Wang

Neural Information Processing Systems | Nov-20-2025, 19:43:12 GMT

As we will explain later, existing works employ the most obvious but not necessarily appropriate option.

artificial intelligence, machine learning, regularization, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Feedback Alignment Meets Low-Rank Manifolds: A Structured Recipe for Local Learning

Roy, Arani, Apolinario, Marco P., Biswas, Shristi Das, Roy, Kaushik

arXiv.org Artificial Intelligence | Oct-30-2025

Training deep neural networks (DNNs) with backpropagation (BP) achieves state-of-the-art accuracy but requires global error propagation and full parameterization, leading to substantial memory and computational overhead. Direct Feedback Alignment (DFA) enables local, parallelizable updates with lower memory requirements but is limited by unstructured feedback and poor scalability in deeper architectures, especially convolutional neural networks. To address these limitations, we propose a structured local learning framework that operates directly on low-rank manifolds defined by the Singular Value Decomposition (SVD) of weight matrices. Each layer is trained in its decomposed form, with updates applied to the SVD components using a composite loss that integrates cross-entropy, subspace alignment, and orthogonality regularization. Feedback matrices are constructed to match the SVD structure, ensuring consistent alignment between forward and feedback pathways. Our method reduces the number of trainable parameters relative to the original DFA model, without relying on pruning or post hoc compression. Experiments on CIFAR-10, CIFAR-100, and ImageNet show that our method achieves accuracy comparable to that of BP. Ablation studies confirm the importance of each loss term in the low-rank setting. These results establish local learning on low-rank manifolds as a principled and scalable alternative to full-rank gradient-based training.
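
The parameter saving from keeping a layer in decomposed form can be sketched as follows. The example assumes a rank-r factorization W = U diag(s) V^T of an m x n weight matrix and shows only the factored forward pass and the parameter counts. The layer sizes, the rank, and the function names are hypothetical; the feedback construction and the composite loss described in the abstract are not reproduced here.

```python
def full_param_count(m, n):
    """Parameters in a dense m x n weight matrix."""
    return m * n


def low_rank_param_count(m, n, r):
    """Parameters in U (m x r), s (r) and V (n x r)."""
    return r * (m + n + 1)


def low_rank_forward(u, s, v, x):
    """Compute U diag(s) V^T x without forming the full matrix."""
    rank = len(s)
    # Project the input onto the right factor: V^T x.
    vt_x = [sum(v[i][k] * x[i] for i in range(len(x))) for k in range(rank)]
    # Scale each component by its singular value.
    scaled = [s[k] * vt_x[k] for k in range(rank)]
    # Map back to the output space with the left factor: U (...).
    return [sum(row[k] * scaled[k] for k in range(rank)) for row in u]


# Hypothetical layer sizes, chosen only to illustrate the arithmetic.
print(full_param_count(512, 256))          # 131072
print(low_rank_param_count(512, 256, 16))  # 12304

u = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # 3 x 2
s = [2.0, 3.0]
v = [[1.0, 0.0], [0.0, 1.0]]              # 2 x 2
print(low_rank_forward(u, s, v, [1.0, 1.0]))  # [2.0, 3.0, 0.0]
```

An orthogonality term on U and V is what keeps such a factorization close to a genuine SVD during training, since gradient updates alone do not preserve orthonormal columns.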

alignment, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.25594

Genre: Research Report (0.82)

Industry: Education (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Preventing Dimensional Collapse in Self-Supervised Learning via Orthogonality Regularization

Neural Information Processing Systems | May-27-2025, 12:31:30 GMT

Self-supervised learning (SSL) has rapidly advanced in recent years, approaching the performance of its supervised counterparts through the extraction of representations from unlabeled data. However, dimensional collapse, where a few large eigenvalues dominate the eigenspace, poses a significant obstacle for SSL. When dimensional collapse occurs on features (e.g. [...] To this end, we propose, for the first time, a mitigation approach employing orthogonal regularization (OR) across the encoder, targeting both convolutional and linear layers during pretraining. OR promotes orthogonality within weight matrices, thus safeguarding against the dimensional collapse of weight matrices, hidden features, and representations. Our empirical investigations demonstrate that OR significantly enhances the performance of SSL methods across diverse benchmarks, yielding consistent gains with both CNNs and Transformer-based architectures.

dimensional collapse, preventing dimensional collapse, representation, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.65)

FedOC: Optimizing Global Prototypes with Orthogonality Constraints for Enhancing Embeddings Separation in Heterogeneous Federated Learning

Guo, Fucheng, Luan, Zeyu, Li, Qing, Zhao, Dan, Jiang, Yong

arXiv.org Artificial Intelligence | Feb-22-2025

Federated Learning (FL) has emerged as an essential framework for distributed machine learning, especially with its potential for privacy-preserving data processing. However, existing FL frameworks struggle to address statistical and model heterogeneity, which severely impacts model performance. While Heterogeneous Federated Learning (HtFL) introduces prototype-based strategies to address the challenges, current approaches face limitations in achieving optimal separation of prototypes. This paper presents FedOC, a novel HtFL algorithm designed to improve global prototype separation through orthogonality constraints, which not only increase intra-class prototype similarity but also significantly expand the inter-class angular separation. With the guidance of the global prototype, each client keeps its embeddings aligned with the corresponding prototype in the feature space, promoting directional independence that integrates seamlessly with the cross-entropy (CE) loss. We provide theoretical proof of FedOC's convergence under non-convex conditions. Extensive experiments demonstrate that FedOC outperforms seven state-of-the-art baselines, achieving up to a 10.12% accuracy improvement in both statistical and model heterogeneity settings.

global prototype, learning, prototype, (14 more...)

arXiv.org Artificial Intelligence

2502.16119

Country:

North America > United States > Virginia (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.46)
Information Technology > Software (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.34)

Learnable Similarity and Dissimilarity Guided Symmetric Non-Negative Matrix Factorization

Lyu, Wenlong, Jia, Yuheng

arXiv.org Artificial Intelligence | Dec-5-2024

Symmetric nonnegative matrix factorization (SymNMF) is a powerful tool for clustering, which typically uses the $k$-nearest neighbor ($k$-NN) method to construct the similarity matrix. However, $k$-NN may mislead clustering since the neighbors may belong to different clusters, and its reliability generally decreases as $k$ grows. In this paper, we construct the similarity matrix as a weighted $k$-NN graph with learnable weights that reflect the reliability of each $k$-th NN. This approach reduces the search space of the similarity matrix learning to $n - 1$ dimensions, as opposed to the $\mathcal{O}(n^2)$ dimensions of existing methods, where $n$ represents the number of samples. Moreover, to obtain a discriminative similarity matrix, we introduce a dissimilarity matrix with a dual structure of the similarity matrix, and propose a new form of orthogonality regularization with discussions on its geometric interpretation and numerical stability. An efficient alternative optimization algorithm is designed to solve the proposed model, with a theoretical guarantee that the variables converge to a stationary point that satisfies the KKT conditions. The advantage of the proposed model is demonstrated by comparison with nine state-of-the-art clustering methods on eight datasets. The code is available at https://github.com/lwl-learning/LSDGSymNMF.
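
One way to read the weighted $k$-NN construction is sketched below: entry (i, j) of the similarity matrix receives weight w_k when sample j is the k-th nearest neighbor of sample i, so the whole matrix is controlled by only n - 1 weights. This is an illustrative reading of the abstract, not the authors' code. The function name, the fixed weights, the Euclidean distance, and the absence of symmetrization are all assumptions.

```python
import math


def weighted_knn_similarity(points, weights):
    """Similarity matrix whose (i, j) entry is the weight of j's neighbor rank.

    `weights[k - 1]` is the weight assigned to every k-th nearest neighbor,
    so `weights` needs exactly len(points) - 1 entries. In the paper these
    weights are learned; here they are fixed inputs.
    """
    n = len(points)
    assert len(weights) == n - 1
    similarity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Rank the other samples by distance; ties keep index order.
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for rank, j in enumerate(others):
            similarity[i][j] = weights[rank]
    return similarity


# Three hypothetical 1-D samples and n - 1 = 2 neighbor-rank weights.
samples = [[0.0], [1.0], [3.0]]
rank_weights = [0.7, 0.3]

for row in weighted_knn_similarity(samples, rank_weights):
    print(row)
# [0.0, 0.7, 0.3]
# [0.7, 0.0, 0.3]
# [0.3, 0.7, 0.0]
```

The resulting matrix is generally not symmetric, because neighbor ranks are not mutual; a symmetric factorization would need an extra symmetrization step that this sketch leaves out.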

factorization, matrix factorization, regularization, (16 more...)

arXiv.org Artificial Intelligence

2412.04082

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Reviews: Can We Gain More from Orthogonality Regularizations in Training Deep Networks?

Neural Information Processing Systems | Oct-8-2024, 02:32:20 GMT

In extensive experiments with state-of-the-art models, the paper shows that soft orthogonality can improve training stability and yield better classification accuracy than the same models trained without such regularization. The paper proposes a method to approximately enforce all singular values of the weight matrices to be equal to 1, using a sampling-based approach that does not require computing an expensive SVD operation. Major comments: This paper presents interesting experiments showing that regularization towards orthogonal weights can stabilize and speed up learning, particularly near the beginning of training, and improve final test accuracy in several large models. These results could be of broad interest. One concern with the experimental methods is that they use carefully sculpted hyperparameter trajectories for some methods. How were these trajectories selected?
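
To show how the deviation of the singular values from 1 can be estimated without an SVD, here is a minimal sketch that applies power iteration to W^T W - I and returns an estimate of its spectral norm. Power iteration is a standard technique; whether it matches the sampling-based procedure the review refers to is an assumption, as are the function names, the iteration count, and the seeded random start vector.

```python
import math
import random


def matvec(matrix, vector):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def spectral_deviation(w, iters=50, seed=0):
    """Estimate the spectral norm of (W^T W - I) by power iteration.

    The value is 0 when all singular values of W equal 1. Only
    matrix-vector products are used, so W^T W is never formed.
    """
    w_t = [list(col) for col in zip(*w)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(len(w_t))]
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    sigma = 0.0
    for _ in range(iters):
        # Apply (W^T W - I) to the current unit vector.
        u = matvec(w_t, matvec(w, v))
        u = [a - b for a, b in zip(u, v)]
        sigma = math.sqrt(sum(a * a for a in u))
        if sigma == 0.0:
            return 0.0
        v = [a / sigma for a in u]
    return sigma


identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[2.0, 0.0], [0.0, 1.0]]

print(spectral_deviation(identity))   # 0.0
print(spectral_deviation(stretched))  # close to 3.0
```

For the stretched matrix the singular values are 2 and 1, so W^T W - I has eigenvalues 3 and 0 and the estimate converges to 3.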

experiment, orthogonality regularization, training deep network, (6 more...)

Neural Information Processing Systems

Genre: Research Report (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)
