AITopics | ivon

Variational Visual Question Answering for Uncertainty-Aware Selective Prediction

Wieczorek, Tobias Jan, Daun, Nathalie, Khan, Mohammad Emtiyaz, Rohrbach, Marcus

arXiv.org Artificial Intelligence, Nov-3-2025

Despite remarkable progress in recent years, vision language models (VLMs) remain prone to overconfidence and hallucinations on tasks such as Visual Question Answering (VQA) and Visual Reasoning. Bayesian methods can potentially improve reliability by helping models selectively predict, that is, models respond only when they are sufficiently confident. Unfortunately, Bayesian methods are often assumed to be costly and ineffective for large models, and so far there exists little evidence to show otherwise, especially for multimodal applications. Here, we show the effectiveness and competitive edge of variational Bayes for selective prediction in VQA for the first time. We build on recent advances in variational methods for deep learning and propose an extension called "Variational VQA". This method improves calibration and yields significant gains for selective prediction on VQA and Visual Reasoning, particularly when the error tolerance is low ($\leq 1\%$). Often, just one posterior sample can yield more reliable answers than those obtained by models trained with AdamW. In addition, we propose a new risk-averse selector that outperforms standard sample averaging by considering the variance of predictions. Overall, we present compelling evidence that variational learning is a viable option to make large VLMs safer and more trustworthy.
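To make the idea of a variance-aware selector concrete, here is a minimal sketch of selective prediction over several posterior samples. It scores each candidate answer by its mean probability across samples minus a multiple of the standard deviation, and abstains when the best score is below a threshold. The mean-minus-deviation form, the names `risk_averse_confidence` and `select_answer`, and the `penalty` and `threshold` values are assumptions for illustration; the abstract does not give the selector's exact formula.

```python
from statistics import mean, pstdev


def risk_averse_confidence(sample_probs, penalty=1.0):
    """Score each answer by mean probability minus penalty * std deviation.

    sample_probs holds one probability vector per posterior sample.
    This scoring rule is an assumed stand-in, not the paper's selector.
    """
    num_answers = len(sample_probs[0])
    scores = []
    for k in range(num_answers):
        probs_k = [p[k] for p in sample_probs]
        scores.append(mean(probs_k) - penalty * pstdev(probs_k))
    return scores


def select_answer(sample_probs, threshold=0.5, penalty=1.0):
    """Return the index of the chosen answer, or None to abstain."""
    scores = risk_averse_confidence(sample_probs, penalty)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None


# Hypothetical probabilities from three posterior samples, two answers each.
agree = [[0.90, 0.10], [0.88, 0.12], [0.92, 0.08]]
disagree = [[0.95, 0.05], [0.35, 0.65], [0.80, 0.20]]
```

With `penalty=0.0` the rule reduces to plain sample averaging, which would still answer on `disagree`; the variance penalty is what pushes that case to abstain.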

artificial intelligence, machine learning, natural language, (14 more...)

arXiv:2505.09591

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bayesian E(3)-Equivariant Interatomic Potential with Iterative Restratification of Many-body Message Passing

Willow, Soohaeng Yoo, Park, Tae Hyeon, Sim, Gi Beom, Moon, Sung Wook, Min, Seung Kyu, Yang, D. ChangMo, Kim, Hyun Woo, Lee, Juho, Myung, Chang Woo

arXiv.org Artificial Intelligence, Oct-6-2025

Machine learning potentials (MLPs) have become essential for large-scale atomistic simulations, enabling ab initio-level accuracy with computational efficiency. However, current MLPs struggle with uncertainty quantification, limiting their reliability for active learning, calibration, and out-of-distribution (OOD) detection. We address these challenges by developing Bayesian E(3)-equivariant MLPs with iterative restratification of many-body message passing. Our approach introduces the joint energy-force negative log-likelihood (NLL$_\text{JEF}$) loss function, which explicitly models uncertainty in both energies and interatomic forces, yielding superior accuracy compared to conventional NLL losses. We systematically benchmark multiple Bayesian approaches, including deep ensembles with mean-variance estimation, stochastic weight averaging Gaussian, improved variational online Newton, and Laplace approximation, by evaluating their performance on uncertainty prediction, OOD detection, calibration, and active learning tasks. We further demonstrate that NLL$_\text{JEF}$ facilitates efficient active learning by quantifying energy and force uncertainties. Using Bayesian active learning by disagreement (BALD), our framework outperforms random sampling and energy-uncertainty-based sampling. Our results demonstrate that Bayesian MLPs achieve competitive accuracy with state-of-the-art models while enabling uncertainty-guided active learning, OOD detection, and energy/forces calibration. This work establishes Bayesian equivariant neural networks as a powerful framework for developing uncertainty-aware MLPs for atomistic simulations at scale.
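BALD is usually stated for classification as the mutual information between the prediction and the model parameters: the entropy of the averaged prediction minus the average entropy of the individual predictions. The sketch below computes that textbook form on hypothetical class probabilities from several sampled models. How the criterion is adapted to energies and forces is not stated in the abstract, so this is an illustration of the general idea rather than the authors' acquisition function.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(sample_probs):
    """Mutual information estimate: H[mean_s p_s] - mean_s H[p_s].

    sample_probs holds one probability vector per sampled model.
    """
    n = len(sample_probs)
    k = len(sample_probs[0])
    mean_probs = [sum(p[j] for p in sample_probs) / n for j in range(k)]
    expected_entropy = sum(entropy(p) for p in sample_probs) / n
    return entropy(mean_probs) - expected_entropy


# Hypothetical cases: models that agree score near zero, models that
# confidently disagree score high and would be queried first.
same_view = [[0.5, 0.5], [0.5, 0.5]]
opposite_views = [[1.0, 0.0], [0.0, 1.0]]
```

The score is zero when every sampled model gives the same prediction, even an uncertain one, which is what separates disagreement from plain predictive entropy.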

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv:2510.03046

Country:

Asia > South Korea (0.46)
North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Uncertainty quantification with approximate variational learning for wearable photoplethysmography prediction tasks

Bench, Ciaran, Desai, Vivek, Moulaeifard, Mohammad, Strodthoff, Nils, Aston, Philip, Thompson, Andrew

arXiv.org Artificial Intelligence, May-19-2025

Photoplethysmography (PPG) signals encode information about relative changes in blood volume that can be used to assess various aspects of cardiac health non-invasively, e.g. to detect atrial fibrillation (AF) or predict blood pressure (BP). Deep networks are well-equipped to handle the large quantities of data acquired from wearable measurement devices. However, they lack interpretability and are prone to overfitting, leaving considerable risk for poor performance on unseen data and misdiagnosis. Here, we describe the use of two scalable uncertainty quantification techniques: Monte Carlo Dropout and the recently proposed Improved Variational Online Newton. These techniques are used to assess the trustworthiness of models trained to perform AF classification and BP regression from raw PPG time series. We find that the choice of hyperparameters has a considerable effect on the predictive performance of the models and on the quality and composition of predicted uncertainties. For example, the stochasticity of the model parameter sampling determines the proportion of the total uncertainty that is aleatoric, and has varying effects on predictive performance and calibration quality dependent on the chosen uncertainty quantification technique and the chosen expression of uncertainty. We find significant discrepancy in the quality of uncertainties over the predicted classes, emphasising the need for a thorough evaluation protocol that assesses local and adaptive calibration. This work suggests that the choice of hyperparameters must be carefully tuned to balance predictive performance and calibration quality, and that the optimal parameterisation may vary depending on the chosen expression of uncertainty.
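The split of total uncertainty into aleatoric and epistemic parts can be illustrated for a regression output with the law of total variance. The sketch assumes each sampled model (for example, one dropout mask or one posterior sample) returns a Gaussian mean and variance for a single input; the function name `decompose_uncertainty` and the numbers are hypothetical, and the paper's own expressions of uncertainty may differ.

```python
from statistics import mean, pvariance


def decompose_uncertainty(sample_means, sample_variances):
    """Split predictive variance using the law of total variance.

    aleatoric: average of the per-sample predicted variances.
    epistemic: variance of the per-sample predicted means.
    """
    aleatoric = mean(sample_variances)
    epistemic = pvariance(sample_means)
    total = aleatoric + epistemic
    fraction = aleatoric / total if total > 0 else 0.0
    return {
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": total,
        "aleatoric_fraction": fraction,
    }


# Hypothetical blood pressure predictions (mmHg) from four sampled models.
bp_means = [120.0, 124.0, 122.0, 126.0]
bp_variances = [4.0, 6.0, 5.0, 5.0]
parts = decompose_uncertainty(bp_means, bp_variances)
```

Sampling parameters with less spread pulls the per-sample means together, which lowers the epistemic term and raises `aleatoric_fraction`.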

artificial intelligence, epistemic uncertainty, machine learning, (20 more...)

arXiv:2505.11412

Country:

Europe > Germany > Lower Saxony > Oldenburg (0.04)
North America > United States > Colorado (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Kuwait > Ahmadi Governorate > Al Ahmadi (0.04)

Genre:

Research Report (1.00)
Overview (0.67)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Variational Learning Induces Adaptive Label Smoothing

Yang, Sin-Han, Liu, Zhedong, Marconi, Gian Maria, Khan, Mohammad Emtiyaz

arXiv.org Artificial Intelligence, Feb-11-2025

We show that variational learning naturally induces an adaptive label smoothing where label noise is specialized for each example. Such label smoothing is useful to handle examples with labeling errors and distribution shifts, but designing a good adaptivity strategy is not always easy. We propose to skip this step and simply use the natural adaptivity induced during the optimization of a variational objective. We show empirical results where a variational algorithm called IVON outperforms traditional label smoothing and yields adaptivity strategies similar to those of an existing approach. By connecting Bayesian methods to label smoothing, our work provides a new way to handle overconfident predictions.
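For reference, traditional label smoothing mixes the one-hot target with a uniform distribution using a single strength for all examples, and an adaptive variant lets that strength vary per example. The sketch below shows both on toy labels. The per-example strengths in `epsilons` are hypothetical hand-picked values, and the uniform noise shape is an assumption; the noise that variational learning induces is derived in the paper and is not reproduced here.

```python
def smooth_label(label, num_classes, epsilon):
    """Uniform label smoothing: (1 - eps) * one_hot + eps / num_classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == label else 0.0) + uniform
        for k in range(num_classes)
    ]


def adaptive_smooth(labels, num_classes, epsilons):
    """Apply a separate smoothing strength to each example."""
    return [
        smooth_label(y, num_classes, eps)
        for y, eps in zip(labels, epsilons)
    ]


# A clean example gets almost no smoothing; a suspect one gets more.
labels = [0, 2]
epsilons = [0.0, 0.4]  # hypothetical per-example strengths
targets = adaptive_smooth(labels, 4, epsilons)
```

Each smoothed target still sums to one, so it can be used directly as the target distribution of a cross-entropy loss.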

artificial intelligence, machine learning, natural language, (19 more...)

arXiv:2502.07273

Country:

Europe > France (0.04)
Europe > Austria (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

Variational Low-Rank Adaptation Using IVON

Cong, Bai, Daheim, Nico, Shen, Yuesong, Cremers, Daniel, Yokota, Rio, Khan, Mohammad Emtiyaz, Möllenhoff, Thomas

arXiv.org Machine Learning, Nov-9-2024

We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in the cost. We replace AdamW by the Improved Variational Online Newton (IVON) algorithm to finetune large language models. For Llama-2 with 7 billion parameters, IVON improves the accuracy over AdamW by 2.8% and expected calibration error by 4.6%. The accuracy is also better than the other Bayesian alternatives, yet the cost is lower and the implementation is easier. Our work provides additional evidence for the effectiveness of IVON for large language models.
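Expected calibration error, the calibration metric quoted above, is commonly estimated by grouping predictions into confidence bins and averaging the gap between confidence and accuracy in each bin, weighted by bin size. The sketch below uses equal-width bins; the bin count, the half-open bin edges, and the function name are assumptions, and the paper's evaluation setup may differ.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Weighted average of |mean confidence - accuracy| over confidence bins."""
    n = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Hypothetical overconfident model: 95% confidence, 75% accuracy.
overconfident = expected_calibration_error(
    [0.95, 0.95, 0.95, 0.95], [True, True, True, False]
)
```

A lower value means confidence tracks accuracy more closely; a model that is always right at confidence 1.0 scores zero.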

ivon, large language model, machine learning, (16 more...)

arXiv:2411.04421

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

A Bayesian Interpretation of Adaptive Low-Rank Adaptation

Chen, Haolin, Garner, Philip N.

arXiv.org Machine Learning, Sep-16-2024

Motivated by the sensitivity-based importance score of the adaptive low-rank adaptation (AdaLoRA), we utilize more theoretically supported metrics, including the signal-to-noise ratio (SNR), along with the Improved Variational Online Newton (IVON) optimizer, for adaptive parameter budget allocation. The resulting Bayesian counterpart not only has matched or surpassed the performance of using the sensitivity-based importance metric but is also a faster alternative to AdaLoRA with Adam. Our theoretical analysis reveals a significant connection between the two metrics, providing a Bayesian perspective on the efficacy of sensitivity as an importance score. Furthermore, our findings suggest that the magnitude, rather than the variance, is the primary indicator of the importance of parameters.
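As a rough picture of SNR-based budget allocation, the sketch below ranks parameter groups by the ratio of absolute posterior mean to posterior standard deviation and keeps the highest-ranked groups. Defining SNR as |mean| / std of a Gaussian posterior is an assumption here, as are the group names, the numbers, and the simple top-k rule; the abstract does not spell out the paper's exact metrics or allocation schedule.

```python
def snr(posterior_mean, posterior_std):
    """Signal-to-noise ratio of one parameter group (assumed |mean| / std)."""
    return abs(posterior_mean) / posterior_std


def allocate_budget(posterior, budget):
    """Keep the `budget` groups with the highest SNR.

    posterior maps a group name to a (mean, std) pair.
    """
    ranked = sorted(
        posterior,
        key=lambda name: snr(*posterior[name]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical posterior summaries for three low-rank components.
posterior = {
    "rank_0": (0.80, 0.10),
    "rank_1": (0.05, 0.20),
    "rank_2": (-0.60, 0.30),
}
kept = allocate_budget(posterior, budget=2)
```

In this toy example a magnitude-only ranking would keep the same two groups, which is in line with the abstract's remark that magnitude is the primary indicator of importance.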

adalora, fine-tuning, sensitivity, (16 more...)

arXiv:2409.10673

Country:

Europe > Austria > Vienna (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Variational Learning is Effective for Large Deep Networks

Shen, Yuesong, Daheim, Nico, Cong, Bai, Nickl, Peter, Marconi, Gian Maria, Bazan, Clement, Yokota, Rio, Gurevych, Iryna, Cremers, Daniel, Khan, Mohammad Emtiyaz, Möllenhoff, Thomas

arXiv.org Machine Learning, Feb-27-2024

We give extensive empirical evidence against the common belief that variational learning is ineffective for large neural networks. We show that an optimizer called Improved Variational Online Newton (IVON) consistently matches or outperforms Adam for training large networks such as GPT-2 and ResNets from scratch. IVON's computational costs are nearly identical to Adam but its predictive uncertainty is better. We show several new use cases of IVON where we improve fine-tuning and model merging in Large Language Models, accurately predict generalization error, and faithfully estimate sensitivity to data. We find overwhelming evidence in support of effectiveness of variational learning.

… Laplace (MacKay, 1992), which do not directly optimize the variational objective, even though they have variational interpretations. Ideally, we want to know whether a direct optimization of the objective can match the accuracy of Adam-like methods without any increase in the cost, while also yielding good weight-uncertainty to improve calibration, model averaging, knowledge transfer, etc. In this paper, we present the Improved Variational Online Newton (IVON) method, which adapts the method of Lin et al. (2020) to large scale and obtains state-of-the-art accuracy and uncertainty at nearly identical cost as Adam. Figure 1 shows some examples where, for training GPT-2 (773M parameters) from scratch, IVON gives 0.4 reduction in validation perplexity over AdamW and, for ResNet-50 (25.6M parameters) on ImageNet, it gives around 2% more accurate …

ivon, learning, variational learning, (15 more...)

arXiv:2402.17641

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
