AITopics | minimisation

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

Neural Information Processing Systems | Feb-11-2026, 15:27:04 GMT

Pitfalls of Epistemic Uncertainty Quantification through Loss Minimisation

Uncertainty quantification has received increasing attention in machine learning in the recent past. In particular, a distinction between aleatoric and epistemic uncertainty has been found useful in this regard. The latter refers to the learner's (lack of) knowledge and appears to be especially difficult to measure and quantify. In this paper, we analyse a recent proposal based on the idea of a second-order learner, which yields predictions in the form of distributions over probability distributions. While standard (first-order) learners can be trained to predict accurate probabilities, namely by minimising suitable loss functions on sample data, we show that loss minimisation does not work for second-order predictors: The loss functions proposed for inducing such predictors do not incentivise the learner to represent its epistemic uncertainty in a faithful way.
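
The claim about missing incentives can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a binary outcome, log-loss as the first-order loss, and a second-order loss defined as the expected first-order loss under a predicted Beta distribution. All function names and parameter values are hypothetical. Because log-loss is convex in the predicted probability, Jensen's inequality implies that a point mass at the mean never scores worse than a spread-out distribution with the same mean, so spreading probability mass to signal epistemic uncertainty is only penalised.

```python
import math
import random


def log_loss(p, y):
    """First-order log-loss for predicted probability p of the positive class."""
    return -math.log(p if y == 1 else 1.0 - p)


def point_mass_loss(p, theta):
    """Expected log-loss of the point prediction p when y ~ Bernoulli(theta)."""
    return theta * log_loss(p, 1) + (1.0 - theta) * log_loss(p, 0)


def expected_second_order_loss(a, b, theta, n_samples=20000, seed=0):
    """Monte Carlo estimate of E_{p ~ Beta(a, b)} E_{y ~ Bernoulli(theta)} log_loss(p, y).

    Assumption: the second-order loss is the expected first-order loss
    under the predicted second-order distribution.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        p = rng.betavariate(a, b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += point_mass_loss(p, theta)
    return total / n_samples


theta = 0.7  # hypothetical true probability of the positive class
wide = expected_second_order_loss(2.1, 0.9, theta)      # mean 0.7, high spread
narrow = expected_second_order_loss(70.0, 30.0, theta)  # mean 0.7, low spread
dirac = point_mass_loss(0.7, theta)                     # all mass on the mean

print(f"wide Beta: {wide:.4f}  narrow Beta: {narrow:.4f}  point mass: {dirac:.4f}")
```

Under these assumptions the loss decreases as the predicted distribution concentrates, regardless of how much data the learner has actually seen.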

data mining, learner, machine learning, (17 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing Systems | Oct-3-2025, 00:47:13 GMT

questions raised by each reviewer separately

We thank the reviewers for their close reading, detailed comments, and overall positive assessment. We will improve the flow and formatting of the paper, and fix the references in the final version. [...] As we can see, ADE consistently achieves comparable or the best performance. [...] We are exploring alternative sampling algorithm embeddings, e.g., [...] ADE limitations and how to overcome [...] See Appendix C for details. [...] ADE, then the parameter tuning requirements for ADE and GANs are comparable, i.e., we tune the inner optimization [...] Re: "[the authors] further conduct T vanilla HMC steps to approximately solve it."

artificial intelligence, machine learning, sampler, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

Neural Information Processing Systems | Oct-2-2025, 21:42:09 GMT

Mixture weights optimisation for Alpha-Divergence Variational Inference

Bayesian Inference involves being able to compute or sample from the posterior density.

artificial intelligence, descent, machine learning, (14 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

arXiv.org Artificial Intelligence | Jul-10-2025

Prevention of Overfitting on Mesh-Structured Data Regressions with a Modified Laplace Operator

Bigarella, Enda D. V.

This document reports on a method for detecting and preventing overfitting on data regressions, herein applied to mesh-like data structures. The mesh structure allows for the straightforward computation of the Laplace-operator second-order derivatives in a finite-difference fashion for noiseless data. Derivatives of the training data are computed on the original training mesh to serve as a true label of the entropy of the training data. Derivatives of the trained data are computed on a staggered mesh to identify oscillations in the interior of the original training mesh cells. The loss of the Laplace-operator derivatives is used for hyperparameter optimisation, achieving a reduction of unwanted oscillation through the minimisation of the entropy of the trained model. In this setup, testing does not require the splitting of points from the training data, and training is thus directly performed on all available training points. The Laplace operator applied to the trained data on a staggered mesh serves as a surrogate testing metric based on diffusion properties.
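
The idea of comparing Laplacians on the training mesh and on a staggered mesh can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the modified operator of the report: it assumes a uniform 1D mesh, the standard three-point central difference, and a mean-squared mismatch as the surrogate score. The names `staggered_laplacian_loss`, `smooth_model`, and `wiggly_model` are hypothetical.

```python
import math


def laplacian_1d(values, h):
    """Three-point central second difference at the interior points of a uniform mesh."""
    return [(values[i - 1] - 2.0 * values[i] + values[i + 1]) / (h * h)
            for i in range(1, len(values) - 1)]


def staggered_laplacian_loss(train_x, train_y, model, h):
    """Mean squared mismatch between the Laplacian of the model on cell midpoints
    and the Laplacian of the training data averaged onto the same locations."""
    lap_train = laplacian_1d(train_y, h)
    midpoints = [x + 0.5 * h for x in train_x[:-1]]
    lap_model = laplacian_1d([model(x) for x in midpoints], h)
    # Average neighbouring node values to get a reference at each interior midpoint.
    reference = [0.5 * (lap_train[j] + lap_train[j + 1]) for j in range(len(lap_model))]
    return sum((m - r) ** 2 for m, r in zip(lap_model, reference)) / len(lap_model)


h = 0.1
train_x = [i * h for i in range(11)]
train_y = [x * x for x in train_x]  # noiseless training data


def smooth_model(x):
    return x * x


def wiggly_model(x):
    # Agrees with the data at every mesh node but oscillates inside each cell.
    return x * x + 0.05 * math.sin(math.pi * x / h)


node_error = max(abs(wiggly_model(x) - y) for x, y in zip(train_x, train_y))
loss_smooth = staggered_laplacian_loss(train_x, train_y, smooth_model, h)
loss_wiggly = staggered_laplacian_loss(train_x, train_y, wiggly_model, h)
print(node_error, loss_smooth, loss_wiggly)
```

Both models fit the training nodes equally well, so a loss evaluated only at the nodes cannot tell them apart. The staggered-mesh score separates them without holding out any training points.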

artificial intelligence, kernel, machine learning, (17 more...)

2507.06631

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Transportation > Air (0.46)
Aerospace & Defense (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Neural Information Processing Systems | Oct-11-2024, 12:42:59 GMT

Learning with Symmetric Label Noise: The Importance of Being Unhinged

Convex potential minimisation is the de facto approach to binary classification. However, Long and Servedio [2008] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing. This ostensibly shows that convex losses are not SLN-robust. In this paper, we propose a convex, classification-calibrated loss and prove that it is SLN-robust. The loss avoids the Long and Servedio [2008] result by virtue of being negatively unbounded.
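
A short numerical check makes the robustness property concrete. The abstract does not state the loss; the sketch below assumes it is the unhinged loss ℓ(y, v) = 1 − yv for labels y in {−1, +1}, which is convex and unbounded below, as the abstract describes. The helper names and the flip rate are hypothetical. Under symmetric label noise with flip probability ρ < 1/2, the expected noisy unhinged loss equals (1 − 2ρ)·ℓ(y, v) + 2ρ, an increasing affine function of the clean loss, so noisy and clean risks share the same minimisers. The hinge loss is included to show that clipping at zero breaks this identity.

```python
def unhinged_loss(y, v):
    """Assumed form of the unhinged loss: linear in the margin y * v, unbounded below."""
    return 1.0 - y * v


def hinge_loss(y, v):
    """Standard hinge loss, shown for contrast."""
    return max(0.0, 1.0 - y * v)


def noisy_expected_loss(loss, y, v, rho):
    """Expected loss when the label y in {-1, +1} is flipped with probability rho."""
    return (1.0 - rho) * loss(y, v) + rho * loss(-y, v)


rho = 0.3  # hypothetical symmetric flip rate, must be below 0.5
scores = [-2.0, -0.5, 0.0, 0.5, 2.0]

unhinged_gaps = [
    abs(noisy_expected_loss(unhinged_loss, y, v, rho)
        - ((1.0 - 2.0 * rho) * unhinged_loss(y, v) + 2.0 * rho))
    for y in (-1, 1) for v in scores
]
hinge_gaps = [
    abs(noisy_expected_loss(hinge_loss, y, v, rho)
        - ((1.0 - 2.0 * rho) * hinge_loss(y, v) + 2.0 * rho))
    for y in (-1, 1) for v in scores
]
print(max(unhinged_gaps), max(hinge_gaps))
```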

sln-robust, symmetric label noise, unhinged, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.85)

Farzin, Amir Ali, Shames, Iman

Minimisation of Polyak-Łojasiewicz Functions Using Random Zeroth-Order Oracles

arXiv.org Artificial Intelligence | May-15-2024

The application of a zeroth-order scheme for minimising Polyak-Łojasiewicz (PL) functions is considered. The framework is based on exploiting a random oracle to estimate the function gradient. Convergence of the algorithm to a global minimum in the unconstrained case, and to a neighbourhood of the global minimum in the constrained case, is presented along with the corresponding complexity bounds. The theoretical results are demonstrated via numerical examples.
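
A random zeroth-order oracle of this kind can be sketched as follows. This is an illustrative sketch under assumptions, not the paper's algorithm: it uses the common two-point Gaussian-smoothing estimate (f(x + μu) − f(x))/μ · u with u drawn from a standard normal, plain descent steps in the unconstrained case, and a standard non-convex test function that satisfies a PL inequality. The step size, smoothing radius μ, iteration count, and all function names are hypothetical choices.

```python
import math
import random


def f(x):
    """Separable non-convex test function sum(x_i^2 + 3 sin^2(x_i)).
    It satisfies a PL inequality and has global minimum 0 at the origin."""
    return sum(xi * xi + 3.0 * math.sin(xi) ** 2 for xi in x)


def random_oracle_gradient(func, x, mu, rng):
    """Two-point gradient estimate (f(x + mu*u) - f(x)) / mu * u with u ~ N(0, I).
    Only function values are used, never derivatives."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    shifted = [xi + mu * ui for xi, ui in zip(x, u)]
    scale = (func(shifted) - func(x)) / mu
    return [scale * ui for ui in u]


def zeroth_order_descent(func, x0, step=0.005, mu=1e-4, iters=5000, seed=0):
    """Unconstrained descent driven by the random oracle above."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = random_oracle_gradient(func, x, mu, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


x_start = [3.0, -2.0]
x_final = zeroth_order_descent(f, x_start)
print(f(x_start), f(x_final))
```

The function is not convex, yet the iterates still approach the global minimum, which is the behaviour the PL condition is meant to guarantee for gradient-type schemes.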

algorithm, inequality, iteration, (16 more...)

2405.09106

Country:

Europe > Russia (0.04)
Europe > Italy (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Leofante, Francesco, Potyka, Nico

Promoting Counterfactual Robustness through Diversity

arXiv.org Artificial Intelligence | Dec-12-2023

Counterfactual explanations shed light on the decisions of black-box models by explaining how an input can be altered to obtain a favourable decision from the model (e.g., when a loan application has been rejected). However, as noted recently, counterfactual explainers may lack robustness in the sense that a minor change in the input can cause a major change in the explanation. This can cause confusion on the user side and open the door to adversarial attacks. In this paper, we study some sources of non-robustness. While there are fundamental reasons why an explainer that returns a single counterfactual cannot be robust in all instances, we show that some interesting robustness guarantees can be given by reporting multiple counterfactuals rather than a single one. Unfortunately, the number of counterfactuals that need to be reported for the theoretical guarantees to hold can be prohibitively large. We therefore propose an approximation algorithm that uses a diversity criterion to select a feasible number of the most relevant explanations, and we study its robustness empirically. Our experiments indicate that our method improves on the state of the art in generating robust explanations, while maintaining other desirable properties and providing competitive computational performance.
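
Selecting a small, diverse subset from many candidate counterfactuals can be sketched with a greedy heuristic. The abstract does not specify the diversity criterion, so the example below is an assumption: it uses greedy farthest-point (max-min distance) selection with Euclidean distance, seeded with the candidate closest to the query input. The function `select_diverse` and the toy data are hypothetical and not taken from the paper.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def select_diverse(query, candidates, k):
    """Greedy selection of k counterfactuals.

    Start with the candidate closest to the query (the most relevant one), then
    repeatedly add the candidate whose minimum distance to the selected set is largest.
    """
    if k <= 0 or not candidates:
        return []
    remaining = list(candidates)
    first = min(remaining, key=lambda c: distance(query, c))
    selected = [first]
    remaining.remove(first)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda c: min(distance(c, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


query = (0.0, 0.0)
candidates = [(1.0, 0.0), (1.1, 0.0), (0.0, 2.0), (-3.0, 0.0), (1.0, 0.1)]
chosen = select_diverse(query, candidates, k=3)
print(chosen)
```

Near-duplicate candidates such as (1.1, 0.0) and (1.0, 0.1) are skipped in favour of counterfactuals that change the input in different directions.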

artificial intelligence, counterfactual, machine learning, (16 more...)

2312.06564

Country: Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.82)

Industry:

Banking & Finance (0.48)
Health & Medicine (0.47)
Transportation (0.34)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Gruber, Nadja, Schwab, Johannes, Debroux, Noémie, Papadakis, Nicolas, Haltmeier, Markus

Single-Image based unsupervised joint segmentation and denoising

arXiv.org Artificial Intelligence | Sep-19-2023

In this work, we develop an unsupervised method for the joint segmentation and denoising of a single image. To this end, we combine the advantages of a variational segmentation method with the power of a self-supervised, single-image based deep learning approach. One major strength of our method is that, in contrast to data-driven methods, which require huge amounts of labeled samples, our model can segment an image into multiple meaningful regions without any training database. Further, we introduce a novel energy functional in which denoising and segmentation are coupled in a way that both tasks benefit from each other. The limitations of existing single-image based variational segmentation methods, which are not capable of dealing with high noise or generic texture, are tackled by this specific combination with self-supervised image denoising. We propose a unified optimisation strategy and show that, especially for the very noisy images encountered in microscopy, our proposed joint approach outperforms its sequential counterpart as well as alternative methods focused purely on denoising or segmentation. Another comparison is conducted with a supervised deep learning approach designed for the same application, highlighting the good performance of our approach.

artificial intelligence, machine learning, segmentation, (16 more...)

2309.10511

Country:

Europe > Austria > Tyrol > Innsbruck (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece > Ionian Islands > Corfu (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)