AITopics | Englesson, Erik

Englesson, Erik

Indirectly Parameterized Concrete Autoencoders

Nilsson, Alfred, Wijk, Klas, Gutha, Sai bharath chandra, Englesson, Erik, Hotti, Alexandra, Saccardi, Carlo, Kviman, Oskar, Lagergren, Jens, Vinuesa, Ricardo, Azizpour, Hossein

arXiv.org Machine Learning, Mar-1-2024

Feature selection is a crucial task in settings where data is high-dimensional or acquiring the full set of features is costly. Recent developments in neural network-based embedded feature selection show promising results across a wide range of applications. Concrete Autoencoders (CAEs), considered state-of-the-art in embedded feature selection, may struggle to achieve stable joint optimization, hurting their training time and generalization. In this work, we identify that this instability is correlated with the CAE learning duplicate selections. To remedy this, we propose a simple and effective improvement: Indirectly Parameterized CAEs (IP-CAEs). IP-CAEs learn an embedding and a mapping from it to the Gumbel-Softmax distributions' parameters. Despite being simple to implement, IP-CAE exhibits significant and consistent improvements over CAE in both generalization and training time across several datasets for reconstruction and classification. Unlike CAE, IP-CAE effectively leverages non-linear relationships and does not require retraining the jointly optimized decoder. Furthermore, our approach is, in principle, generalizable to Gumbel-Softmax distributions beyond feature selection.
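The indirect parameterization described in this abstract can be illustrated with a minimal sketch. It assumes the mapping from the embedding to the Gumbel-Softmax parameters is a plain linear map; the abstract only says "a mapping", so the names `indirect_logits`, `weight`, `bias`, and all numeric values here are hypothetical and not taken from the paper.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel sample, clamping u away from 0 and 1."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax of (logits + Gumbel noise) / temperature."""
    noisy = [(l + sample_gumbel(rng)) / temperature for l in logits]
    top = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def indirect_logits(embedding, weight, bias):
    """Assumed linear map from a learned embedding to logits over input features."""
    return [sum(w * e for w, e in zip(row, embedding)) + b
            for row, b in zip(weight, bias)]


rng = random.Random(0)
embedding = [0.5, -1.0]                         # hypothetical embedding of one selector
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical map to 3 input features
bias = [0.0, 0.0, 0.0]

logits = indirect_logits(embedding, weight, bias)
selection = gumbel_softmax(logits, temperature=0.5, rng=rng)
```

In a direct parameterization the logits themselves would be the learned quantities; here they are computed from `embedding`, `weight`, and `bias`, which is the only difference this sketch is meant to show.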

artificial intelligence, ip-cae, machine learning, (18 more...)

arXiv.org Machine Learning

2403.00563

Country: Europe > Sweden (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Logistic-Normal Likelihoods for Heteroscedastic Label Noise

Englesson, Erik, Mehrpanah, Amir, Azizpour, Hossein

arXiv.org Artificial Intelligence, Aug-14-2023

A natural way of estimating heteroscedastic label noise in regression is to model the observed (potentially noisy) target as a sample from a normal distribution, whose parameters can be learned by minimizing the negative log-likelihood. This formulation has desirable loss attenuation properties, as it reduces the contribution of high-error examples. Intuitively, this behavior can improve robustness against label noise by reducing overfitting. We propose an extension of this simple and probabilistic approach to classification that has the same desirable loss attenuation properties. Furthermore, we discuss and address some practical challenges of this extension. We evaluate the effectiveness of the method by measuring its robustness against label noise in classification. We perform enlightening experiments exploring the inner workings of the method, including sensitivity to hyperparameters, ablation studies, and other insightful analyses.
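The loss attenuation mentioned for the regression case can be checked numerically. The sketch below uses the standard negative log-likelihood of a normal distribution with a predicted mean and log-variance; it covers only the regression starting point, not the classification extension the paper proposes, and the function name and example values are illustrative assumptions.

```python
import math


def gaussian_nll(y, mu, log_var):
    """Negative log-likelihood of y under N(mu, exp(log_var))."""
    return 0.5 * (log_var + (y - mu) ** 2 / math.exp(log_var) + math.log(2 * math.pi))


# Same large residual (3.0), two different predicted variances.
target, prediction = 3.0, 0.0
nll_small_var = gaussian_nll(target, prediction, log_var=0.0)            # variance 1
nll_large_var = gaussian_nll(target, prediction, log_var=math.log(9.0))  # variance 9

# Predicting a larger variance for the high-error example lowers its loss,
# which is the attenuation effect: the squared error is divided by the variance,
# while the log-variance term keeps the variance from growing without bound.
```

For a fixed residual r, the loss is minimized when the predicted variance equals r squared, which is why variance 9 is used for the residual of 3 above.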

artificial intelligence, machine learning, noise, (17 more...)

arXiv.org Artificial Intelligence

2304.02849

Country: Europe (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Deep Double Descent via Smooth Interpolation

Gamba, Matteo, Englesson, Erik, Björkman, Mårten, Azizpour, Hossein

arXiv.org Artificial Intelligence, Apr-8-2023

The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common intuition from polynomial regression suggests that overparameterized networks are able to sharply interpolate noisy data, without considerably deviating from the ground-truth signal, thus preserving generalization ability. At present, a precise characterization of the relationship between interpolation and generalization for deep networks is missing. In this work, we quantify sharpness of fit of the training data interpolated by neural network functions, by studying the loss landscape w.r.t. the input variable locally to each training point, over volumes around cleanly- and noisily-labelled training samples, as we systematically increase the number of model parameters and training epochs. Our findings show that loss sharpness in the input space follows both model- and epoch-wise double descent, with worse peaks observed around noisy labels. While small interpolating models sharply fit both clean and noisy data, large interpolating models express a smooth loss landscape, where noisy targets are predicted over large volumes around training data points, in contrast to existing intuition.

artificial intelligence, augmentation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2209.1008

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Consistency Regularization Can Improve Robustness to Label Noise

Englesson, Erik, Azizpour, Hossein

arXiv.org Machine Learning, Oct-4-2021

Consistency regularization is a commonly used technique for semi-supervised and self-supervised learning. It is an auxiliary objective function that encourages the prediction of the network to be similar in the vicinity of the observed training samples. Hendrycks et al. (2020) have recently shown that such regularization naturally brings test-time robustness to corrupted data and helps with calibration. This paper empirically studies the relevance of consistency regularization for training-time robustness to noisy labels. First, we make two interesting and useful observations regarding the consistency of networks trained with the standard cross entropy loss on noisy datasets: (i) networks trained on noisy data have lower consistency than those trained on clean data, and (ii) the consistency reduces more significantly around noisy-labelled training data points than correctly-labelled ones. Then, we show that a simple loss function that encourages consistency improves the robustness of the models to label noise on both synthetic (CIFAR-10, CIFAR-100) and real-world (WebVision) noise, as well as across different noise rates and types, and achieves state-of-the-art results.
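As a rough illustration of an auxiliary consistency objective, the sketch below adds a divergence between the predictions on a sample and on a perturbed copy of it to the usual cross entropy. The choice of KL divergence, the `weight` parameter, and the function names are assumptions made for this example; the abstract does not specify the paper's exact loss.

```python
import math


def cross_entropy(label, probs):
    """Cross entropy between a hard label (class index) and predicted probabilities."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q); assumes q is strictly positive, as softmax outputs are."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def consistency_objective(label, probs_clean, probs_perturbed, weight=1.0):
    """Supervised term plus a penalty for disagreeing on a nearby (perturbed) input."""
    supervised = cross_entropy(label, probs_clean)
    consistency = kl_divergence(probs_clean, probs_perturbed)
    return supervised + weight * consistency


# Hypothetical predictions for one sample and for an augmented version of it.
probs_clean = [0.7, 0.2, 0.1]
probs_perturbed = [0.5, 0.3, 0.2]
loss = consistency_objective(0, probs_clean, probs_perturbed)
```

When the two predictions agree, the penalty is zero and the objective reduces to plain cross entropy; it grows as the predictions around the sample diverge.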

artificial intelligence, ground transportation, machine learning, (16 more...)

arXiv.org Machine Learning

2110.01242

Country: Europe > Sweden (0.14)

Genre: Research Report (1.00)

Industry: Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

Englesson, Erik, Azizpour, Hossein

arXiv.org Machine Learning, May-11-2021

We propose two novel loss functions based on Jensen-Shannon divergence for learning under label noise. Following the work of Ghosh et al. (2017), we argue about their theoretical robustness. Furthermore, we reveal several other desirable properties by drawing informative connections to various loss functions, e.g., cross entropy, mean absolute error, generalized cross entropy, symmetric cross entropy, label smoothing, and most importantly consistency regularization. We conduct extensive and systematic experiments using both synthetic (CIFAR) and real (WebVision) noise and demonstrate significant and consistent improvements over other loss functions. Also, we conduct several informative side experiments that highlight the different theoretical properties.
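For readers unfamiliar with the underlying quantity, here is a minimal sketch of a weighted Jensen-Shannon divergence between several distributions, applied to a one-hot label and a prediction. It shows the textbook divergence only; the specific weighting and scaling used by the two losses in the paper are not given in the abstract and are not reproduced here, and the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL(p || q), skipping zero-probability entries of p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(dists, weights):
    """Weighted JS divergence: sum_i w_i * KL(p_i || m), with m the weighted mixture."""
    n_classes = len(dists[0])
    mixture = [sum(w * d[k] for w, d in zip(weights, dists)) for k in range(n_classes)]
    return sum(w * kl_divergence(d, mixture) for w, d in zip(weights, dists))


one_hot_label = [1.0, 0.0]
prediction = [0.9, 0.1]   # hypothetical softmax output
js_loss = js_divergence([one_hot_label, prediction], [0.5, 0.5])

# Unlike cross entropy, the value stays bounded even for a confidently wrong prediction:
worst_case = js_divergence([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
```

With two distributions and equal weights the divergence never exceeds log 2, and that maximum is reached when the two distributions place their mass on different classes.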

deep learning, loss function, neural network, (21 more...)

arXiv.org Machine Learning

2105.04522

Country: Europe > Sweden (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology (0.67)
Transportation > Ground > Road (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Efficient Evaluation-Time Uncertainty Estimation by Improved Distillation

Englesson, Erik, Azizpour, Hossein

arXiv.org Machine Learning, Jun-12-2019

In this work we aim to obtain computationally efficient uncertainty estimates with deep networks. For this, we propose a modified knowledge distillation procedure that achieves state-of-the-art uncertainty estimates for both in- and out-of-distribution samples. Our contributions include a) demonstrating and adapting to distillation's regularization effect, b) proposing a novel target teacher distribution, c) a simple augmentation procedure to improve out-of-distribution uncertainty estimates, and d) shedding light on the distillation procedure through a comprehensive set of experiments.

deep learning, neural network, student, (19 more...)

arXiv.org Machine Learning

1906.05419

Country:

North America > Canada (0.14)
Europe > Sweden (0.14)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
