AITopics

2506.11898

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Industry: Education > Educational Setting > Online (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Choi, Chloe H., Zanoni, Andrea, Schiavazzi, Daniele E., Marsden, Alison L.

On the performance of multi-fidelity and reduced-dimensional neural emulators for inference of physiologic boundary conditions

arXiv.org Machine LearningJun-16-2025

Solving inverse problems in cardiovascular modeling is particularly challenging due to the high computational cost of running high-fidelity simulations. In this work, we focus on Bayesian parameter estimation and explore different methods to reduce the computational cost of sampling from the posterior distribution by leveraging low-fidelity approximations. A common approach is to construct a surrogate model for the high-fidelity simulation itself. Another is to build a surrogate for the discrepancy between high- and low-fidelity models. This discrepancy, which is often easier to approximate, is modeled with either a fully connected neural network or a nonlinear dimensionality reduction technique that enables surrogate construction in a lower-dimensional space. A third possible approach is to treat the discrepancy between the high-fidelity and surrogate models as random noise and estimate its distribution using normalizing flows. This allows us to incorporate the approximation error into the Bayesian inverse problem by modifying the likelihood function. We validate five different methods which are variations of the above on analytical test cases by comparing them to posterior distributions derived solely from high-fidelity models, assessing both accuracy and computational cost. Finally, we demonstrate our approaches on two cardiovascular examples of increasing complexity: a lumped-parameter Windkessel model and a patient-specific three-dimensional anatomy.

artificial intelligence, machine learning, posterior distribution, (18 more...)

2506.11683

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Schönlieb, Carola-Bibiane, Shumaylov, Zakhar

Data-driven approaches to inverse problems

arXiv.org Artificial IntelligenceJun-16-2025

Inverse problems are concerned with the reconstruction of unknown physical quantities using indirect measurements and are fundamental across diverse fields such as medical imaging, remote sensing, and material sciences. These problems serve as critical tools for visualizing internal structures beyond what is visible to the naked eye, enabling quantification, diagnosis, prediction, and discovery. However, most inverse problems are ill-posed, necessitating robust mathematical treatment to yield meaningful solutions. While classical approaches provide mathematically rigorous and computationally stable solutions, they are constrained by the ability to accurately model solution properties and implement them efficiently. A more recent paradigm considers deriving solutions to inverse problems in a data-driven manner. Instead of relying on classical mathematical modeling, this approach utilizes highly over-parameterized models, typically deep neural networks, which are adapted to specific inverse problems using carefully selected training data. Current approaches that follow this new paradigm distinguish themselves through solution accuracy paired with computational efficiency that was previously inconceivable. These notes offer an introduction to this data-driven paradigm for inverse problems. The first part of these notes will provide an introduction to inverse problems, discuss classical solution strategies, and present some applications. The second part will delve into modern data-driven approaches, with a particular focus on adversarial regularization and provably convergent linear plug-and-play denoisers. Throughout the presentation of these methodologies, their theoretical properties will be discussed, and numerical examples will be provided. The lecture series will conclude with a discussion of open problems and future perspectives in the field.

artificial intelligence, inverse problem, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.11732

Country:

Europe > United Kingdom > England (0.28)
North America > United States > California (0.27)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Energy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(5 more...)

arXiv.org Artificial IntelligenceJun-16-2025

Robust Filtering -- Novel Statistical Learning and Inference Algorithms with Applications

Chughtai, Aamir Hussain

State estimation or filtering serves as a fundamental task to enable intelligent decision-making in applications such as autonomous vehicles, robotics, healthcare monitoring, smart grids, intelligent transportation, and predictive maintenance. Standard filtering assumes prior knowledge of noise statistics to extract latent system states from noisy sensor data. However, real-world scenarios involve abnormalities like outliers, biases, drifts, and missing observations with unknown or partially known statistics, limiting conventional approaches. This thesis presents novel robust nonlinear filtering methods to mitigate these challenges. Based on insights from our filtering proposals, we extend the formulations to offline estimation/learning setups and propose smoothing extensions. Our methods leverage Bayesian inference frameworks, employing both deterministic and stochastic approximation techniques including Variational Inference (VI) and Particle Filters/Sequential Monte Carlo (SMC). We also study theoretical estimation limits using Bayesian Cramér-Rao bounds (BCRBs) in the context of measurement abnormalities. To validate the performance gains of the proposed methods, we perform simulations and experiments in scenarios including target tracking, indoor localization, 3D point cloud registration, mesh registration, and pose graph optimization. The fundamental nature of the work makes it useful in diverse applications, with possible future extensions toward developing outlier-robust machine learning pipelines, learning system dynamics from anomalous data, and addressing challenges in generative AI where standard diffusion models struggle with outliers, imbalanced datasets, and mode collapse.

artificial intelligence, gaussian distribution, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.1153

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Education (1.00)
Government > Military (0.92)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kumagai, Atsutoshi, Iwata, Tomoharu, Nishiyama, Taishi, Ida, Yasutoshi, Fujiwara, Yasuhiro

Meta-learning Representations for Learning from Multiple Annotators

We propose a meta-learning method for learning from multiple noisy annotators. In many applications such as crowdsourcing services, labels for supervised learning are given by multiple annotators. Since the annotators have different skills or biases, given labels can be noisy. To learn accurate classifiers, existing methods require many noisy annotated data. However, sufficient data might be unavailable in practice. To overcome the lack of data, the proposed method uses labeled data obtained in different but related tasks. The proposed method embeds each example in tasks to a latent space by using a neural network and constructs a probabilistic model for learning a task-specific classifier while estimating annotators' abilities on the latent space. This neural network is meta-learned to improve the expected test classification performance when the classifier is adapted to a given small amount of annotated data. This classifier adaptation is performed by maximizing the posterior probability via the expectation-maximization (EM) algorithm. Since each step in the EM algorithm is easily computed as a closed-form and is differentiable, the proposed method can efficiently backpropagate the loss through the EM algorithm to meta-learn the neural network. We show the effectiveness of our method with real-world datasets with synthetic noise and real-world crowdsourcing datasets.

annotator, artificial intelligence, machine learning, (16 more...)

2506.10259

Country:

Asia > Middle East > Jordan (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)

Arpogaus, Marcel, Kneib, Thomas, Nagler, Thomas, Rügamer, David

Hybrid Bernstein Normalizing Flows for Flexible Multivariate Density Regression with Interpretable Marginals

Density regression models allow a comprehensive understanding of data by modeling the complete conditional probability distribution. While flexible estimation approaches such as normalizing flows (NF) work particularly well in multiple dimensions, interpreting the input-output relationship of such models is often difficult, due to the black-box character of deep learning models. In contrast, existing statistical methods for multivariate outcomes such as multivariate conditional transformation models (MCTM) are restricted in flexibility and are often not expressive enough to represent complex multivariate probability distributions. In this paper, we combine MCTM with state-of-the-art and autoregressive NF to leverage the transparency of MCTM for modeling interpretable feature effects on the marginal distributions in the first step and the flexibility of neural-network-based NF techniques to account for complex and non-linear relationships in the joint data distribution. We demonstrate our method's versatility in various numerical experiments and compare it with MCTM and other NF models on both simulated and real-world data.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2505.14164

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Germany > Lower Saxony > Gottingen (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Sanokowski, Sebastian, Gruber, Lukas, Bartmann, Christoph, Hochreiter, Sepp, Lehner, Sebastian

Rethinking Losses for Diffusion Bridge Samplers

Diffusion bridges are a promising class of deep-learning methods for sampling from unnormalized distributions. Recent works show that the Log Variance (LV) loss consistently outperforms the reverse Kullback-Leibler (rKL) loss when using the reparametrization trick to compute rKL-gradients. While the on-policy LV loss yields identical gradients to the rKL loss when combined with the log-derivative trick for diffusion samplers with non-learnable forward processes, this equivalence does not hold for diffusion bridges or when diffusion coefficients are learned. Based on this insight we argue that for diffusion bridges the LV loss does not represent an optimization objective that can be motivated like the rKL loss via the data processing inequality. Our analysis shows that employing the rKL loss with the log-derivative trick (rKL-LD) does not only avoid these conceptual problems but also consistently outperforms the LV loss. Experimental results with different types of diffusion bridges on challenging benchmarks show that samplers trained with the rKL-LD loss achieve better performance. From a practical perspective we find that rKL-LD requires significantly less hyperparameter optimization and yields more stable training behavior.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2506.10982

Country:

Europe > Austria > Upper Austria > Linz (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Textual Bayes: Quantifying Uncertainty in LLM-Based Systems

Ross, Brendan Leigh, Vouitsis, Noël, Ghomi, Atiyeh Ashari, Hosseinzadeh, Rasa, Xin, Ji, Liu, Zhaoyan, Sui, Yi, Hou, Shiyi, Leung, Kin Kwan, Loaiza-Ganem, Gabriel, Cresswell, Jesse C.

Although large language models (LLMs) are becoming increasingly capable of solving challenging real-world tasks, accurately quantifying their uncertainty remains a critical open problem, which limits their applicability in high-stakes domains. This challenge is further compounded by the closed-source, black-box nature of many state-of-the-art LLMs. Moreover, LLM-based systems can be highly sensitive to the prompts that bind them together, which often require significant manual tuning (i.e., prompt engineering). In this work, we address these challenges by viewing LLM-based systems through a Bayesian lens. We interpret prompts as textual parameters in a statistical model, allowing us to use a small training dataset to perform Bayesian inference over these prompts. This novel perspective enables principled uncertainty quantification over both the model's textual parameters and its downstream predictions, while also incorporating prior beliefs about these parameters expressed in free-form text. To perform Bayesian inference, a difficult problem even for well-studied data modalities, we introduce Metropolis-Hastings through LLM Proposals (MHLP), a novel Markov chain Monte Carlo (MCMC) algorithm that combines prompt optimization techniques with standard MCMC methods. MHLP is a turnkey modification to existing LLM pipelines, including those that rely exclusively on closed-source models. Empirically, we demonstrate that our method yields improvements in both predictive accuracy and uncertainty quantification (UQ) on a range of LLM benchmarks and UQ tasks. More broadly, our work demonstrates a viable path for incorporating methods from the rich Bayesian literature into the era of LLMs, paving the way for more reliable and calibrated LLM-based systems.

large language model, machine learning, natural language, (16 more...)

2506.1006

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Northumberland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.67)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Haddouche, Maxime, Sakhi, Otmane

Logarithmic Smoothing for Adaptive PAC-Bayesian Off-Policy Learning

Off-policy learning serves as the primary framework for learning optimal policies from logged interactions collected under a static behavior policy. In this work, we investigate the more practical and flexible setting of adaptive off-policy learning, where policies are iteratively refined and re-deployed to collect higher-quality data. Building on the success of PAC-Bayesian learning with Logarithmic Smoothing (LS) in static settings, we extend this framework to the adaptive scenario using tools from online PAC-Bayesian theory. Furthermore, we demonstrate that a principled adjustment to the LS estimator naturally accommodates multiple rounds of deployment and yields faster convergence rates under mild conditions. Our method matches the performance of leading offline approaches in static settings, and significantly outperforms them when intermediate policy deployments are allowed. Empirical evaluations across diverse scenarios highlight both the advantages of adaptive data collection and the strength of the PAC-Bayesian formulation.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2506.10664

Country:

North America > United States (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Ferrara, Antonio, Cozzi, Francesco, Perotti, Alan, Panisson, André, Bonchi, Francesco

Size-adaptive Hypothesis Testing for Fairness

Determining whether an algorithmic decision-making system discriminates against a specific demographic typically involves comparing a single point estimate of a fairness metric against a predefined threshold. This practice is statistically brittle: it ignores sampling error and treats small demographic subgroups the same as large ones. The problem intensifies in intersectional analyses, where multiple sensitive attributes are considered jointly, giving rise to a larger number of smaller groups. As these groups become more granular, the data representing them becomes too sparse for reliable estimation, and fairness metrics yield excessively wide confidence intervals, precluding meaningful conclusions about potential unfair treatments. In this paper, we introduce a unified, size-adaptive, hypothesis-testing framework that turns fairness assessment into an evidence-based statistical decision. Our contribution is twofold. (i) For sufficiently large subgroups, we prove a Central-Limit result for the statistical parity difference, leading to analytic confidence intervals and a Wald test whose type-I (false positive) error is guaranteed at level $α$. (ii) For the long tail of small intersectional groups, we derive a fully Bayesian Dirichlet-multinomial estimator; Monte-Carlo credible intervals are calibrated for any sample size and naturally converge to Wald intervals as more data becomes available. We validate our approach empirically on benchmark datasets, demonstrating how our tests provide interpretable, statistically rigorous decisions under varying degrees of data availability and intersectionality.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2506.10586

Country:

North America > United States (0.14)
Europe > Italy > Piedmont > Turin Province > Turin (0.05)
Europe > Austria > Styria > Graz (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Law (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)