AITopics | minimiser

Collaborating Authors

minimiser

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings

Neural Information Processing SystemsJun-19-2026, 02:51:49 GMT

When this structure is known, at least locally, it can be exploited through reduction mappings that reparametrise part of the parameter space to lie on the solution manifold. These reductions naturally arise from inner optimisation problems and effectively remove redundant directions, yielding a lowerdimensional objective. In this work, we introduce a general framework to understand how such reductions influence the optimisation landscape. We show that well-designed reduction mappings improve curvature properties of the objective, leading to better-conditioned problems and theoretically faster convergence for gradient-based methods. Our analysis unifies a range of scenarios where structural information at optimality is leveraged to accelerate convergence, offering a principled explanation for the empirical gains observed in such optimisation algorithms.

artificial intelligence, machine learning, mapping, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Add feedback

MAPEstimation with Denoisers: Convergence Rates and Guarantees

Neural Information Processing SystemsJun-17-2026, 17:24:01 GMT

Denoiser models have become powerful tools for inverse problems, enabling the use of pretrained networks to approximate the score of a smoothed prior distribution. These models are often used in heuristic iterative schemes aimed at solving Maximum a Posteriori (MAP) optimisation problems, where the proximal operator of the negative log-prior plays a central role. In practice, this operator is intractable, and practitioners plug in a pretrained denoiser as a surrogate--despite the lack of general theoretical justification for this substitution. In this work, we show that a simple algorithm, closely related to several used in practice, provably converges to the proximal operator under a log-concavity assumption on the prior p. We show that this algorithm can be interpreted as a gradient descent on smoothed proximal objectives. Our analysis thus provides a theoretical foundation for a class of empirically successful but previously heuristic methods.

artificial intelligence, inequality, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Soft Specialists: $α$-Rényi Ensembles for Uncertainty-Aware LLM Post-Training

Cordero-Encinar, Paula, Tyukin, Georgy, Duncan, Andrew B.

arXiv.org Machine LearningMay-28-2026

Existing training approaches for large language models learn a single set of parameters, based on large volumes of data, which is typically heterogeneous, conflicting and often outright contradictory. As a result, the model is forced to compress conflicting goals, and inherent uncertainties into a single, averaged pattern of behaviour. We propose an $α$-Rényi variational framework for learning distributions over post-training parameters, offering an uncertainty-aware alternative to deep ensemble approaches. The resulting variational objective interpolates between classical variational Bayes and predictively oriented posterior learning, balancing between globally plausible individual models against systems of complementary specialists. We identify local stability criteria, demonstrating how model misspecification can make non-degenerate posterior spread locally favourable, manifesting contradictory or conflicting data as epistemic uncertainty. We apply our framework to LLM post-training, learning an ensemble of LoRA adapters attached to a shared, frozen base model, providing a scalable training procedure for both supervised fine-tuning and preference optimisation. Our approach enables training examples to be softly routed across ensemble members, promoting model specialisation and providing actionable uncertainty estimates across different tasks.

large language model, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2605.27747

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

Add feedback

bc1d640f841f752c689aae20b31198c1-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 15:27:08 GMT

epistemic uncertainty, learner, prediction, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Pitfalls of Epistemic Uncertainty Quantification through Loss Minimisation

Neural Information Processing SystemsFeb-11-2026, 15:27:04 GMT

Uncertainty quantification has received increasing attention in machine learning in the recent past. In particular, a distinction between aleatoric and epistemic uncertainty has been found useful in this regard. The latter refers to the learner's (lack of) knowledge and appears to be especially difficult to measure and quantify. In this paper, we analyse a recent proposal based on the idea of a second-order learner, which yields predictions in the form of distributions over probability distributions. While standard (first-order) learners can be trained to predict accurate probabilities, namely by minimising suitable loss functions on sample data, we show that loss minimisation does not work for second-order predictors: The loss functions proposed for inducing such predictors do not incentivise the learner to represent its epistemic uncertainty in a faithful way.

data mining, learner, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings

Markou, Evan, Ajanthan, Thalaiyasingam, Gould, Stephen

arXiv.org Artificial IntelligenceOct-27-2025

Many high-dimensional optimisation problems exhibit rich geometric structures in their set of minimisers, often forming smooth manifolds due to over-parametrisation or symmetries. When this structure is known, at least locally, it can be exploited through reduction mappings that reparametrise part of the parameter space to lie on the solution manifold. These reductions naturally arise from inner optimisation problems and effectively remove redundant directions, yielding a lower-dimensional objective. In this work, we introduce a general framework to understand how such reductions influence the optimisation landscape. We show that well-designed reduction mappings improve curvature properties of the objective, leading to better-conditioned problems and theoretically faster convergence for gradient-based methods. Our analysis unifies a range of scenarios where structural information at optimality is leveraged to accelerate convergence, offering a principled explanation for the empirical gains observed in such optimisation algorithms.

artificial intelligence, machine learning, mapping, (17 more...)

arXiv.org Artificial Intelligence

2506.08428

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Add feedback

Minimisation of Submodular Functions Using Gaussian Zeroth-Order Random Oracles

Farzin, Amir Ali, Pun, Yuen-Man, Braun, Philipp, Summers, Tyler, Shames, Iman

arXiv.org Artificial IntelligenceOct-20-2025

We consider the minimisation problem of submodular functions and investigate the application of a zeroth-order method to this problem. The method is based on exploiting a Gaussian smoothing random oracle to estimate the smoothed function gradient. We prove the convergence of the algorithm to a global $ε$-approximate solution in the offline case and show that the algorithm is Hannan-consistent in the online case with respect to static regret. Moreover, we show that the algorithm achieves $O(\sqrt{NP_N^\ast})$ dynamic regret, where $N$ is the number of iterations and $P_N^\ast$ is the path length. The complexity analysis and hyperparameter selection are presented for all the cases. The theoretical results are illustrated via numerical examples.

artificial intelligence, lov asz extension, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.15257

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Rates of Convergence of Generalised Variational Inference Posteriors under Prior Misspecification

Mildner, Terje, Giampouras, Paris, Damoulas, Theodoros

arXiv.org Machine LearningOct-6-2025

We prove rates of convergence and robustness to prior misspecification within a Generalised Variational Inference (GVI) framework with bounded divergences. This addresses a significant open challenge for GVI and Federated GVI that employ a different divergence to the Kullback--Leibler under prior misspecification, operate within a subset of possible probability measures, and result in intractable posteriors. Our theoretical contributions cover severe prior misspecification while relying on our ability to restrict the space of possible GVI posterior measures, and infer properties based on this space. In particular, we are able to establish sufficient conditions for existence and uniqueness of GVI posteriors on arbitrary Polish spaces, prove that the GVI posterior measure concentrates on a neighbourhood of loss minimisers, and extend this to rates of convergence regardless of the prior measure.

divergence, gvi posterior, posterior, (13 more...)

arXiv.org Machine Learning

2510.03109

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Add feedback

Filters

Collaborating Authors

minimiser

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings

MAPEstimation with Denoisers: Convergence Rates and Guarantees

Soft Specialists: $α$-Rényi Ensembles for Uncertainty-Aware LLM Post-Training

b444ad72520a5f5c467343be88e352ed-Paper-Conference.pdf

bc1d640f841f752c689aae20b31198c1-Supplemental-Conference.pdf

Pitfalls of Epistemic Uncertainty Quantification through Loss Minimisation

Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings

Minimisation of Submodular Functions Using Gaussian Zeroth-Order Random Oracles

b444ad72520a5f5c467343be88e352ed-Paper-Conference.pdf

Rates of Convergence of Generalised Variational Inference Posteriors under Prior Misspecification