
Regularizing Towards Permutation Invariance In Recurrent Models

Neural Information Processing Systems

In many machine learning problems the output should not depend on the order of the inputs. Such "permutation invariant" functions have been studied extensively in recent years. Here we argue that temporal architectures such as RNNs are highly relevant for such problems, despite their inherent dependence on order. We show that RNNs can be regularized towards permutation invariance, and that this can result in more compact models than non-recursive architectures. Existing solutions (e.g., DeepSets) mostly restrict the learning problem to hypothesis classes that are permutation invariant by design. Our approach of enforcing permutation invariance via regularization instead gives rise to learning functions that are semi-permutation-invariant.
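The abstract does not spell out the form of the penalty; a minimal numpy sketch of one plausible variant is below. The function names are illustrative assumptions, and comparing against the reversed order stands in for what a full method would likely do by averaging over random permutations:

```python
import numpy as np

def rnn_final_state(W_h, W_x, xs):
    """Run a plain tanh RNN over the sequence xs; return the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x in xs:
        h = np.tanh(W_h @ h + W_x @ x)
    return h

def permutation_regularizer(W_h, W_x, xs):
    """Squared gap between the RNN's final states on the original and the
    reversed input order. Driving this penalty to zero pushes the model
    toward order independence."""
    h_fwd = rnn_final_state(W_h, W_x, xs)
    h_rev = rnn_final_state(W_h, W_x, xs[::-1])
    return float(np.sum((h_fwd - h_rev) ** 2))
```

The penalty is exactly zero when the sequence elements are identical (any reordering yields the same sequence) and positive for a generic RNN on distinct inputs.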


Regularizing by the Variance of the Activations' Sample-Variances

Neural Information Processing Systems

Normalization techniques play an important role in supporting efficient and often more effective training of deep neural networks. While conventional methods explicitly normalize the activations, we suggest adding a loss term instead. This new loss term encourages the variance of the activations to be stable, varying little from one random mini-batch to the next. As we prove, this encourages the activations to be distributed around a few distinct modes. We also show that if the inputs come from a mixture of two Gaussians, the new loss either joins the two together or separates them optimally in the LDA sense, depending on the prior probabilities. Finally, we link the new regularization term to the batchnorm method, which provides batchnorm with a regularization perspective. Our experiments demonstrate an improvement in accuracy over the batchnorm technique for both CNNs and fully connected networks.
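The title describes the loss directly: the variance, across mini-batches, of each batch's sample variance of the activations. A minimal numpy sketch for a single unit (the function name is an illustrative assumption, not the paper's API):

```python
import numpy as np

def activation_variance_loss(batches):
    """Variance, across mini-batches, of each batch's sample variance of one
    unit's activations. The loss is zero exactly when every mini-batch shows
    the same spread, i.e. the unit's activation variance is stable."""
    sample_vars = np.array([np.var(b) for b in batches])
    return float(np.var(sample_vars))
```

In training, this scalar would presumably be summed over units and added to the task loss with a tuning coefficient; no explicit normalization of the activations is performed.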




Regularizing Towards Soft Equivariance Under Mixed Symmetries

Kim, Hyunsu, Lee, Hyungi, Yang, Hongseok, Lee, Juho

arXiv.org Artificial Intelligence

Datasets often have intrinsic symmetries, and particular deep-learning models, called equivariant or invariant models, have been developed to exploit them. However, if some or all of these symmetries are only approximate, as frequently happens in practice, these models may be suboptimal due to the architectural restrictions imposed on them. We tackle this issue of approximate symmetries in a setup where symmetries are mixed, i.e., there are symmetries of multiple different types, and the degree of approximation varies across these types. Instead of proposing a new architectural restriction as in most previous approaches, we present a regularizer-based method for building a model for a dataset with mixed approximate symmetries. The key component of our method is what we call the equivariance regularizer for a given type of symmetry, which measures how equivariant a model is with respect to symmetries of that type. Our method is trained with these regularizers, one per symmetry type, and the strength of each regularizer is automatically tuned during training, leading to the discovery of the approximation levels of candidate symmetry types without explicit supervision. Using synthetic function approximation and motion forecasting tasks, we demonstrate that our method achieves better accuracy than prior approaches while correctly discovering the approximate symmetry levels.
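An equivariance regularizer of the kind the abstract describes typically measures the gap between f(g·x) and g·f(x) over group elements g. A minimal numpy sketch for 2-D rotations (the setup and function names are illustrative assumptions; the paper handles general mixed symmetry types):

```python
import numpy as np

def rotation(theta):
    """2-D rotation matrix for angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def equivariance_regularizer(f, xs, thetas):
    """Average squared gap between f(g @ x) and g @ f(x) over sampled rotations
    g and inputs x. Zero iff f commutes with every sampled rotation on the
    sampled inputs; larger values mean the model is less equivariant."""
    total = 0.0
    for theta in thetas:
        R = rotation(theta)
        for x in xs:
            total += np.sum((f(R @ x) - R @ f(x)) ** 2)
    return total / (len(thetas) * len(xs))
```

The identity map is exactly equivariant and scores zero, while a nonzero constant map is not and scores positive; in the paper's setting, a learned weight on each such regularizer would reflect how approximate that symmetry type is.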


A general method for regularizing tensor decomposition methods via pseudo-data

Gottesman, Omer, Pan, Weiwei, Doshi-Velez, Finale

arXiv.org Machine Learning

Tensor decomposition methods (TDMs) have recently gained popularity as ways of performing inference for latent variable models [Anandkumar et al., 2014]. The interest in these methods is motivated by the fact that they come with theoretical global convergence guarantees in the limit of infinite data [Anandkumar et al., 2012, Arora et al., 2013]. However, a main limitation of these methods is that they lack natural means of regularization, or of encouraging desired properties on the model parameters, when the amount of data is limited. Previous works attempted to alleviate this drawback by modifying existing tensor decomposition methods to incorporate specific constraints, such as sparsity [Sun et al., 2015], or modeling assumptions, such as the existence of anchor words [Arora et al., 2013, Nguyen et al., 2014]. All of these works develop bespoke algorithms tailored to those constraints or assumptions. Furthermore, many of these methods impose hard constraints on the learned model, which may be detrimental as the size of the data grows; framed in the context of Bayesian intuition, when we have a lot of data, we want our methods to allow the evidence to overwhelm our priors. We introduce an alternative approach which can be applied to encourage any (differentiable) desired structure or properties on the model parameters, and which will only encourage this "prior" information when the data is insufficient. Specifically, we adopt the common view of Bayesian priors as representing "pseudo-observations" of artificial data which bias our learned model parameters towards our prior belief [Bishop, 2006]. We apply the tensor decomposition method of Anandkumar et al.
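The pseudo-observation view can be illustrated on the simplest moment statistic a TDM consumes. The sketch below (an assumption for illustration; the paper works with the higher-order moment tensors of Anandkumar et al.'s method) blends the empirical second-moment matrix of n real points with that of m pseudo-points, so the pseudo-data's weight m/(n+m) vanishes as real data accumulates:

```python
import numpy as np

def blended_moments(data, pseudo_data):
    """Second-moment matrix of the real data augmented with pseudo-observations.
    With n real rows and m pseudo rows, the pseudo-data contributes weight
    m/(n+m), so the evidence overwhelms the 'prior' as n grows."""
    stacked = np.vstack([data, pseudo_data])
    return stacked.T @ stacked / len(stacked)
```

With one real point and one pseudo-point the two contribute equally; with 99 real points the pseudo-data's influence shrinks to 1%, matching the Bayesian intuition in the paragraph above.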