AITopics

2505.12796

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sprunck, Tom, Pereyra, Marcelo, Liaudat, Tobias

Bayesian model selection and misspecification testing in imaging inverse problems only from noisy and partial measurements

arXiv.org Machine Learning, Nov-3-2025

Modern imaging techniques heavily rely on Bayesian statistical models to address difficult image reconstruction and restoration tasks. This paper addresses the objective evaluation of such models in settings where ground truth is unavailable, with a focus on model selection and misspecification diagnosis. Existing unsupervised model evaluation methods are often unsuitable for computational imaging due to their high computational cost and incompatibility with modern image priors defined implicitly via machine learning models. We herein propose a general methodology for unsupervised model selection and misspecification detection in Bayesian imaging sciences, based on a novel combination of Bayesian cross-validation and data fission, a randomized measurement splitting technique. The approach is compatible with any Bayesian imaging sampler, including diffusion and plug-and-play samplers. We demonstrate the methodology through experiments involving various scoring rules and types of model misspecification, where we achieve excellent selection and detection accuracy with a low computational cost.
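The data fission step mentioned in the abstract can be illustrated for the simplest case of additive Gaussian noise with known standard deviation. The sketch below is not the authors' implementation: the function name `gaussian_fission`, the parameter `tau`, and the toy signal `mu` (a stand-in for the noise-free measurement) are assumptions made for illustration. It uses the standard Gaussian construction in which a single noisy measurement is split into two copies whose noise terms are independent, so one copy can be used to fit a model and the other to score it.

```python
import numpy as np

def gaussian_fission(y, sigma, tau=1.0, rng=None):
    """Split y = mu + N(0, sigma^2) into two measurements with independent noise.

    Hypothetical helper for illustration; assumes sigma is known.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.normal(0.0, sigma, size=np.shape(y))  # auxiliary noise, same scale as the data noise
    y_fit = y + tau * z    # noise variance sigma^2 * (1 + tau^2)
    y_eval = y - z / tau   # noise variance sigma^2 * (1 + 1 / tau^2)
    return y_fit, y_eval

rng = np.random.default_rng(0)
mu = np.linspace(0.0, 1.0, 50_000)  # toy noise-free measurement (assumption)
sigma, tau = 0.5, 1.0
y = mu + rng.normal(0.0, sigma, size=mu.shape)

y_fit, y_eval = gaussian_fission(y, sigma, tau=tau, rng=rng)

# Empirical covariance between the two noise terms; it should be close to zero.
noise_cov = np.mean((y_fit - mu) * (y_eval - mu))
# Empirical noise variance of the fitting copy; it should be close to sigma^2 * (1 + tau^2).
fit_noise_var = np.var(y_fit - mu)
# No information is lost: the original measurement is a weighted average of the two copies.
y_back = (y_fit + tau**2 * y_eval) / (1 + tau**2)
```

Larger values of `tau` put more noise into the fitting copy and less into the evaluation copy, so the parameter trades reconstruction quality against the reliability of the score.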

artificial intelligence, machine learning, model selection, (16 more...)

2510.27663

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Li, Wenwen, Sanz-Alonso, Daniel, Yang, Ruiyi

Bayesian Optimization on Networks

arXiv.org Machine Learning, Nov-3-2025

This paper studies optimization on networks modeled as metric graphs. Motivated by applications where the objective function is expensive to evaluate or only available as a black box, we develop Bayesian optimization algorithms that sequentially update a Gaussian process surrogate model of the objective to guide the acquisition of query points. To ensure that the surrogates are tailored to the network's geometry, we adopt Whittle-Matérn Gaussian process prior models defined via stochastic partial differential equations on metric graphs. In addition to establishing regret bounds for optimizing sufficiently smooth objective functions, we analyze the practical case in which the smoothness of the objective is unknown and the Whittle-Matérn prior is represented using finite elements. Numerical results demonstrate the effectiveness of our algorithms for optimizing benchmark objective functions on a synthetic metric graph and for Bayesian inversion via maximum a posteriori estimation on a telecommunication network.

artificial intelligence, machine learning, optimization, (18 more...)

2510.27643

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Industry:

Telecommunications (0.55)
Health & Medicine > Pharmaceuticals & Biotechnology (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Oct-31-2025

A Unified Theory for Causal Inference: Direct Debiased Machine Learning via Bregman-Riesz Regression

Kato, Masahiro

This note introduces a unified theory for causal inference that integrates Riesz regression, covariate balancing, density-ratio estimation (DRE), targeted maximum likelihood estimation (TMLE), and the matching estimator in average treatment effect (ATE) estimation. In ATE estimation, the balancing weights and the regression functions of the outcome play important roles, where the balancing weights are referred to as the Riesz representer, bias-correction term, and clever covariates, depending on the context. Riesz regression, covariate balancing, DRE, and the matching estimator are methods for estimating the balancing weights, where Riesz regression is essentially equivalent to DRE in the ATE context, the matching estimator is a special case of DRE, and DRE is in a dual relationship with covariate balancing. TMLE is a method for constructing regression function estimators such that the leading bias term becomes zero. Nearest Neighbor Matching is equivalent to Least Squares Density Ratio Estimation and Riesz Regression.

artificial intelligence, bayesian inference, machine learning, (15 more...)

2510.26783

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models

Rahmati, Amir Hossein, Jantre, Sanket, Zhang, Weifeng, Wang, Yucheng, Yoon, Byung-Jun, Urban, Nathan M., Qian, Xiaoning

Low-Rank Adaptation (LoRA) offers a cost-effective solution for fine-tuning large language models (LLMs), but it often produces overconfident predictions in data-scarce few-shot settings. To address this issue, several classical statistical learning approaches have been repurposed for scalable uncertainty-aware LoRA fine-tuning. However, these approaches neglect how input characteristics affect the predictive uncertainty estimates. To address this limitation, we propose Contextual Low-Rank Adaptation (C-LoRA) as a novel uncertainty-aware and parameter-efficient fine-tuning approach, by developing new lightweight LoRA modules contextualized to each input data sample to dynamically adapt uncertainty estimates. Incorporating data-driven contexts into the parameter posteriors, C-LoRA mitigates overfitting, achieves well-calibrated uncertainties, and yields robust predictions. Extensive experiments on LLaMA2-7B models demonstrate that C-LoRA consistently outperforms the state-of-the-art uncertainty-aware LoRA methods in both uncertainty quantification and model generalization. Ablation studies further confirm the critical role of our contextual modules in capturing sample-specific uncertainties. C-LoRA sets a new standard for robust, uncertainty-aware LLM fine-tuning in few-shot regimes. Although our experiments are limited to 7B models, our method is architecture-agnostic and, in principle, applies beyond this scale; studying its scaling to larger models remains an open problem. Our code is available at https://github.com/ahra99/c_lora.

large language model, machine learning, natural language, (18 more...)

2505.17773

Country:

Europe (0.92)
North America > United States > Texas (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

How Regularization Terms Make Invertible Neural Networks Bayesian Point Estimators

Heilenkötter, Nick

Whenever a quantity of interest cannot be observed directly but only through an indirect measurement process or in the presence of noise, one is faced with an inverse problem. To stabilize the reconstruction and mitigate the information loss inherent in the measurement, it is necessary to incorporate additional knowledge about the unknown data, namely its prior distribution, which encodes what one expects the reconstruction to resemble, such as the characteristic features of natural images. Yet our ability to describe natural images in an explicit, algorithmic form remains quite limited. Fortunately, recent years have seen the emergence of data-driven approaches that enable the construction of priors directly from collections of representative samples. While these approaches often surpass classical methods in reconstruction quality, many of them lack theoretical guarantees and remain difficult to interpret. A promising direction explored recently [3, 4, 5, 21] involves invertible neural networks. Thanks to their bidirectional structure, a single network can simultaneously approximate the forward operator and serve as a reconstruction method, with stability ensured by the architecture itself. This hybrid use makes it possible to assess deviations from a known forward operator, or even replace it by a data-based version, while maintaining interpretability of the reconstruction process by the learned measurement model and vice versa. This dual capability is particularly relevant in applications where both high-fidelity reconstructions and a faithful representation of the measurement process are critical, such as scientific imaging and med-

artificial intelligence, inverse problem, machine learning, (19 more...)

2510.26704

Country: Europe > Germany > Bremen > Bremen (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Amirzadeh, Rasoul, Thiruvady, Dhananjay, Shiri, Fatemeh

Bayesian Network Fusion of Large Language Models for Sentiment Analysis

Large language models (LLMs) continue to advance, with an increasing number of domain-specific variants tailored for specialised tasks. However, these models often lack transparency and explainability, can be costly to fine-tune, require substantial prompt engineering, yield inconsistent results across domains, and have a significant adverse environmental impact due to their high computational demands. To address these challenges, we propose the Bayesian network LLM fusion (BNLF) framework, which integrates predictions from three LLMs (FinBERT, RoBERTa, and BERTweet) through a probabilistic mechanism for sentiment analysis. BNLF performs late fusion by modelling the sentiment predictions from multiple LLMs as probabilistic nodes within a Bayesian network. Evaluated across three human-annotated financial corpora with distinct linguistic and contextual characteristics, BNLF demonstrates consistent gains of about six percent in accuracy over the baseline LLMs, underscoring its robustness to dataset variability and the effectiveness of probabilistic fusion for interpretable sentiment classification.

large language model, machine learning, natural language, (19 more...)

2510.26484

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (1.00)
Banking & Finance (1.00)
Information Technology (0.93)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Estimating cognitive biases with attention-aware inverse planning

Banerjee, Sounak, Cornelisse, Daphne, Gopinath, Deepak, Sumner, Emily, DeCastro, Jonathan, Rosman, Guy, Vinitsky, Eugene, Ho, Mark K.

People's goal-directed behaviors are influenced by their cognitive biases, and autonomous systems that interact with people should be aware of this. For example, people's attention to objects in their environment will be biased in a way that systematically affects how they perform everyday tasks such as driving to work. Here, building on recent work in computational cognitive science, we formally articulate the attention-aware inverse planning problem, in which the goal is to estimate a person's attentional biases from their actions. We demonstrate how attention-aware inverse planning systematically differs from standard inverse reinforcement learning and how cognitive biases can be inferred from behavior. Finally, we present an approach to attention-aware inverse planning that combines deep reinforcement learning with computational cognitive modeling. We use this approach to infer the attentional strategies of RL agents in real-life driving scenarios selected from the Waymo Open Dataset, demonstrating the scalability of estimating cognitive biases with attention-aware inverse planning.

machine learning, reinforcement learning, simulation of human behavior, (19 more...)

2510.25951

Country: North America > United States > California (0.46)

Genre: Research Report (0.82)

Industry:

Automobiles & Trucks (1.00)
Information Technology (0.67)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Rooijakkers, Joshua, Rønneberg, Leiv, Briol, François-Xavier, Knoblauch, Jeremias, Altamirano, Matias

Multi-Output Robust and Conjugate Gaussian Processes

arXiv.org Machine Learning, Oct-31-2025

Multi-output Gaussian process (MOGP) regression allows modelling dependencies among multiple correlated response variables. Similarly to standard Gaussian processes, MOGPs are sensitive to model misspecification and outliers, which can distort predictions within individual outputs. This situation can be further exacerbated by multiple anomalous response variables whose errors propagate due to correlations between outputs. To handle this situation, we extend and generalise the robust and conjugate Gaussian process (RCGP) framework introduced by Altamirano et al. (2024). This results in the multi-output RCGP (MO-RCGP): a provably robust MOGP that is conjugate, and jointly captures correlations across outputs. We thoroughly evaluate our approach through applications in finance and cancer research.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

2510.26401

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Banking & Finance (0.92)
Health & Medicine > Therapeutic Area > Oncology (0.87)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Ryu, J. Jon, Yeddanapudi, Pavan, Xu, Xiangxiang, Wornell, Gregory W.

Contrastive Predictive Coding Done Right for Mutual Information Estimation

arXiv.org Machine Learning, Oct-31-2025

The InfoNCE objective, originally introduced for contrastive representation learning, has become a popular choice for mutual information (MI) estimation, despite its indirect connection to MI. In this paper, we demonstrate why InfoNCE should not be regarded as a valid MI estimator, and we introduce a simple modification, which we refer to as InfoNCE-anchor, for accurate MI estimation. Our modification introduces an auxiliary anchor class, enabling consistent density ratio estimation and yielding a plug-in MI estimator with significantly reduced bias. Beyond this, we generalize our framework using proper scoring rules, which recover InfoNCE-anchor as a special case when the log score is employed. This formulation unifies a broad spectrum of contrastive objectives, including NCE, InfoNCE, and $f$-divergence variants, under a single principled framework. Empirically, we find that InfoNCE-anchor with the log score achieves the most accurate MI estimates; however, in self-supervised representation learning experiments, we find that the anchor does not improve the downstream task performance. These findings corroborate that contrastive representation learning benefits not from accurate MI estimation per se, but from the learning of structured density ratios.
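One widely known limitation of the plain InfoNCE objective as an MI estimate, which may or may not be the argument the paper makes, is that its value can never exceed log K for a batch of K pairs. The sketch below computes the standard InfoNCE estimate from a matrix of critic scores and shows that it saturates at log K. It does not implement the InfoNCE-anchor modification, whose details are not given in the abstract; the function name `infonce_estimate` and the toy score matrices are assumptions for illustration.

```python
import numpy as np

def infonce_estimate(scores):
    """Standard InfoNCE estimate from a K x K critic score matrix.

    scores[i, j] is the critic value f(x_i, y_j); the diagonal holds the
    positive (paired) samples. Hypothetical helper for illustration.
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[0]
    # Numerically stable log-sum-exp over each row.
    row_max = scores.max(axis=1, keepdims=True)
    log_sum_exp = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    # mean_i [ f(x_i, y_i) - log( (1/K) * sum_j exp f(x_i, y_j) ) ]
    return float(np.mean(np.diag(scores) - log_sum_exp) + np.log(k))

k = 8
perfect = 1e3 * np.eye(k)        # critic that separates positives from negatives almost perfectly
uninformative = np.zeros((k, k))  # critic that cannot tell pairs apart

est_perfect = infonce_estimate(perfect)      # saturates at log(K)
est_flat = infonce_estimate(uninformative)   # equals 0
```

Because each row term `f(x_i, y_i) - logsumexp_j f(x_i, y_j)` is at most zero, the estimate is capped at log K however strong the true dependence is, so large MI values can only be reached with correspondingly large batches.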

artificial intelligence, machine learning, objective, (17 more...)

2510.25983

Country: North America > United States (0.67)

Genre: Research Report (0.63)

Industry: Law > Litigation (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)