AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

Latent Variable Modeling for Robust Causal Effect Estimation

Morimura, Tetsuro, Oka, Tatsushi, Suzuki, Yugo, Moriwaki, Daisuke

arXiv.org Artificial Intelligence, Aug-29-2025

Latent variable models provide a powerful framework for incorporating and inferring unobserved factors in observational data. In causal inference, they help account for hidden factors influencing treatment or outcome, thereby addressing challenges posed by missing or unmeasured covariates. This paper proposes a new framework that integrates latent variable modeling into the double machine learning (DML) paradigm to enable robust causal effect estimation in the presence of such hidden factors. We consider two scenarios: one where a latent variable affects only the outcome, and another where it may influence both treatment and outcome. To ensure tractability, we incorporate latent variables only in the second stage of DML, separating representation learning from latent inference. We demonstrate the robustness and effectiveness of our method through extensive experiments on both synthetic and real-world datasets.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2508.20259

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Marketing (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

A Unified Theory of Language

Worden, Robert

arXiv.org Artificial Intelligence, Aug-29-2025

A unified theory of language combines a Bayesian cognitive linguistic model of language processing with the proposal that language evolved by sexual selection for the display of intelligence. The theory accounts for the major facts of language, including its speed and expressivity, and data on language diversity, pragmatics, syntax and semantics. The computational element of the theory is based on Construction Grammars. These give an account of the syntax and semantics of the world's languages, using constructions and unification. Two novel elements are added to construction grammars: an account of language pragmatics, and an account of fast, precise language learning. Constructions are represented in the mind as graph-like feature structures. People use slow general inference to understand the first few examples they hear of any construction. After that it is learned as a feature structure, and is rapidly applied by unification. All aspects of language (phonology, syntax, semantics, and pragmatics) are seamlessly computed by fast unification; there is no boundary between semantics and pragmatics. This accounts for the major puzzles of pragmatics, and for detailed pragmatic phenomena. Unification is Bayesian maximum likelihood pattern matching. This gives evolutionary continuity between language processing in the human brain and Bayesian cognition in animal brains. Language is the basis of our mind-reading abilities, our cooperation, self-esteem and emotions; the foundations of human culture and society.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2508.20109

Country:

Europe > United Kingdom > England (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)

Genre:

Research Report (0.81)
Overview (0.67)

Industry:

Health & Medicine > Consumer Health (0.67)
Leisure & Entertainment (0.67)
Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.92)

Unfolding AlphaFold's Bayesian Roots in Probability Kinematics

Hamelryck, Thomas, Mardia, Kanti V.

arXiv.org Artificial Intelligence, Aug-28-2025

We present a novel theoretical interpretation of AlphaFold1 that reveals the potential of generalized Bayesian updating for probabilistic deep learning. The seminal breakthrough of AlphaFold1 in protein structure prediction by deep learning relied on a learned potential energy function, in contrast to the later end-to-end architectures of AlphaFold2 and AlphaFold3. While this potential was originally justified by referring to physical potentials of mean force (PMFs), we reinterpret AlphaFold1's potential as an instance of probability kinematics -- also known as Jeffrey conditioning -- a principled but under-recognised generalization of conventional Bayesian updating. Probability kinematics accommodates uncertain or soft evidence in the form of updated probabilities over a partition. This perspective reveals AlphaFold1's potential as a form of generalized Bayesian updating, rather than a thermodynamic potential. To confirm our probabilistic framework's scope and precision, we analyze a synthetic 2D model in which an angular random walk prior is updated with evidence on distances via probability kinematics, mirroring AlphaFold1's approach. This theoretical contribution connects AlphaFold1 to a broader class of well-justified Bayesian methods, allowing precise quantification, surpassing merely qualitative heuristics based on PMFs. Our contribution is theoretical: we replace AlphaFold1's heuristic analogy with a principled probabilistic framework, tested in a controlled synthetic setting where correctness can be assessed. More broadly, our results point to the considerable promise of probability kinematics for probabilistic deep learning, by allowing the formulation of complex models from a few simpler components.
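As a rough illustration of the updating rule named in the abstract above, the following sketch applies Jeffrey conditioning to a small discrete joint distribution. It is not the paper's model: the function name `jeffrey_update`, the rain/wet toy variables, and all probabilities are assumptions made up for this example. The rule keeps the conditionals P(x | e) fixed and replaces the marginal over the partition cells e with new, possibly uncertain, probabilities Q(e).

```python
def jeffrey_update(joint, new_marginal):
    """Jeffrey conditioning on a finite joint distribution.

    joint: dict mapping (x, e) -> P(x, e), where e indexes the partition cells.
    new_marginal: dict mapping e -> Q(e), the updated (soft) evidence.
    Returns a dict mapping x -> Q(x) = sum_e P(x | e) * Q(e).
    """
    # Prior marginal P(e) over the partition cells.
    prior_e = {}
    for (x, e), p in joint.items():
        prior_e[e] = prior_e.get(e, 0.0) + p

    # Reweight each conditional P(x | e) by the new cell probability Q(e).
    updated = {}
    for (x, e), p in joint.items():
        if prior_e[e] > 0.0:
            weight = new_marginal.get(e, 0.0)
            updated[x] = updated.get(x, 0.0) + (p / prior_e[e]) * weight
    return updated


# Hypothetical toy joint over weather x and an observation e of the pavement.
joint = {
    ("rain", "wet"): 0.3,
    ("rain", "not_wet"): 0.1,
    ("dry", "wet"): 0.1,
    ("dry", "not_wet"): 0.5,
}

# Soft evidence: we are only 80% sure the pavement is wet.
soft = jeffrey_update(joint, {"wet": 0.8, "not_wet": 0.2})

# Hard evidence (Q(wet) = 1) recovers ordinary Bayesian conditioning P(x | wet).
hard = jeffrey_update(joint, {"wet": 1.0, "not_wet": 0.0})
```

With certain evidence the update reduces to conventional conditioning, which is the sense in which probability kinematics generalizes Bayesian updating.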

artificial intelligence, dihedral angle, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2505.19763

Country:

Europe > United Kingdom > England (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Active Query Selection for Crowd-Based Reinforcement Learning

Erskine, Jonathan, Yamagata, Taku, Santos-Rodríguez, Raúl

arXiv.org Artificial Intelligence, Aug-27-2025

Preference-based reinforcement learning has gained prominence as a strategy for training agents in environments where the reward signal is difficult to specify or misaligned with human intent. However, its effectiveness is often limited by the high cost and low availability of reliable human input, especially in domains where expert feedback is scarce or errors are costly. To address this, we propose a novel framework that combines two complementary strategies: probabilistic crowd modelling to handle noisy, multi-annotator feedback, and active learning to prioritize feedback on the most informative agent actions. We extend the Advise algorithm to support multiple trainers, estimate their reliability online, and incorporate entropy-based query selection to guide feedback requests. We evaluate our approach in a set of environments that span both synthetic and real-world-inspired settings, including 2D games (Taxi, Pacman, Frozen Lake) and a blood glucose control task for Type 1 Diabetes using the clinically approved UVA/Padova simulator. Our preliminary results demonstrate that agents trained with feedback on uncertain trajectories exhibit faster learning in most tasks, and we outperform the baselines for the blood glucose control task.
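The entropy-based query selection mentioned in the abstract above can be sketched generically as follows. This is only an illustration of the general idea (ask for feedback where the agent's action distribution is most uncertain), not the authors' extension of the Advise algorithm; the names `entropy` and `select_query` and the toy policy are assumptions for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_query(policy):
    """Return the state whose action distribution has the highest entropy.

    policy: dict mapping state -> list of action probabilities.
    """
    return max(policy, key=lambda state: entropy(policy[state]))


# Hypothetical policy over three states and two actions.
policy = {
    "s0": [0.9, 0.1],   # fairly confident
    "s1": [0.5, 0.5],   # maximally uncertain
    "s2": [1.0, 0.0],   # fully confident
}
query_state = select_query(policy)
```

In this toy case the query goes to `s1`, the state where the agent is least sure which action to take.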

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2508.19132

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks

Chai, Qi, Zheng, Zhang, Ren, Junlong, Ye, Deheng, Lin, Zichuan, Wang, Hao

arXiv.org Artificial Intelligence, Aug-27-2025

Minecraft, as an open-world virtual interactive environment, has become a prominent platform for research on agent decision-making and execution. Existing works primarily adopt a single Large Language Model (LLM) agent to complete various in-game tasks. However, for complex tasks requiring lengthy sequences of actions, single-agent approaches often face challenges related to inefficiency and limited fault tolerance. Despite these issues, research on multi-agent collaboration remains scarce. In this paper, we propose CausalMACE, a holistic causality planning framework designed to enhance multi-agent systems, in which we incorporate causality to manage dependencies among subtasks. Technically, our proposed framework introduces two modules: an overarching task graph for global task planning and a causality-based module for dependency management, where inherent rules are adopted to perform causal intervention. Experimental results demonstrate our approach achieves state-of-the-art performance in multi-agent cooperative tasks of Minecraft.

agent, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.18797

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Games > Computer Games (0.93)
Materials > Metals & Mining (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

A Novel Framework for Uncertainty Quantification via Proper Scores for Classification and Beyond

Gruber, Sebastian G.

arXiv.org Machine Learning, Aug-26-2025

In this PhD thesis, we propose a novel framework for uncertainty quantification in machine learning, which is based on proper scores. Uncertainty quantification is an important cornerstone for trustworthy and reliable machine learning applications in practice. Usually, approaches to uncertainty quantification are problem-specific, and solutions and insights cannot be readily transferred from one task to another. Proper scores are loss functions minimized by predicting the target distribution. Due to their very general definition, proper scores apply to regression, classification, or even generative modeling tasks. We contribute several theoretical results that connect epistemic uncertainty, aleatoric uncertainty, and model calibration with proper scores, resulting in a general and widely applicable framework. We achieve this by introducing a general bias-variance decomposition for strictly proper scores via functional Bregman divergences. Specifically, we use the kernel score, a kernel-based proper score, for evaluating sample-based generative models in various domains, like image, audio, and natural language generation. This includes a novel approach for uncertainty estimation of large language models, which outperforms state-of-the-art baselines. Further, we generalize the calibration-sharpness decomposition beyond classification, which motivates the definition of proper calibration errors. We then introduce a novel estimator for proper calibration errors in classification, and a novel risk-based approach to compare different estimators for squared calibration errors. Last, we offer a decomposition of the kernel spherical score, another kernel-based proper score, allowing a more fine-grained and interpretable evaluation of generative image models.
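The defining property stated in the abstract above (a proper score is minimized in expectation by predicting the target distribution) can be checked numerically for two standard examples, the Brier score and the log score, on a binary outcome. This is a minimal sketch of that property only, not of the thesis's framework; the function names and the grid search are assumptions chosen for illustration.

```python
import math


def expected_brier(p, q):
    """Expected Brier score when the true P(y=1) is p and the forecast is q."""
    return p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2


def expected_log_score(p, q):
    """Expected negative log-likelihood when the true P(y=1) is p and the forecast is q."""
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))


def best_forecast(score, p, grid_size=999):
    """Grid-search the forecast q in (0, 1) that minimizes the expected score."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda q: score(p, q))


true_p = 0.3
q_brier = best_forecast(expected_brier, true_p)
q_log = best_forecast(expected_log_score, true_p)
```

Both searches land on the forecast equal to `true_p`, which is what makes these losses proper: a forecaster cannot lower the expected loss by reporting anything other than the true probability.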

large language model, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

doi: 10.21248/gups.93204

2508.18001

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Minnesota (0.04)
Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
(6 more...)

Genre:

Research Report > Promising Solution (0.66)
Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.45)
Energy > Power Industry (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Multidimensional Distributional Neural Network Output Demonstrated in Super-Resolution of Surface Wind Speed

Goldwyn, Harrison J., Krock, Mitchell, Rudi, Johann, Getter, Daniel, Bessac, Julie

arXiv.org Machine Learning, Aug-26-2025

Accurate quantification of uncertainty in neural network predictions remains a central challenge for scientific applications involving high-dimensional, correlated data. While existing methods capture either aleatoric or epistemic uncertainty, few offer closed-form, multidimensional distributions that preserve spatial correlation while remaining computationally tractable. In this work, we present a framework for training neural networks with a multidimensional Gaussian loss, generating closed-form predictive distributions over outputs with non-identically distributed and heteroscedastic structure. Our approach captures aleatoric uncertainty by iteratively estimating the means and covariance matrices, and is demonstrated on a super-resolution example. We leverage a Fourier representation of the covariance matrix to stabilize network training and preserve spatial correlation. We introduce a novel regularization strategy -- referred to as information sharing -- that interpolates between image-specific and global covariance estimates, enabling convergence of the super-resolution downscaling network trained on image-specific distributional loss functions. This framework allows for efficient sampling, explicit correlation modeling, and extensions to more complex distribution families, all without disrupting prediction performance. We demonstrate the method on a surface wind speed downscaling task and discuss its broader applicability to uncertainty-aware prediction in scientific models.
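To make the idea of a multidimensional Gaussian loss concrete, here is a small sketch of the negative log-likelihood of a two-dimensional Gaussian with a full covariance matrix. It is a textbook formula written out for the 2x2 case, not the paper's Fourier-parameterized loss; the function name `gaussian_nll_2d` and the example numbers are assumptions for illustration.

```python
import math


def gaussian_nll_2d(residual, cov):
    """Negative log-likelihood of a 2D Gaussian.

    residual: (r1, r2), the target minus the predicted mean.
    cov: ((a, b), (b, d)), a symmetric positive-definite covariance matrix.
    """
    (a, b), (_, d) = cov
    det = a * d - b * b
    if a <= 0.0 or det <= 0.0:
        raise ValueError("covariance must be positive definite")
    r1, r2 = residual
    # Quadratic form r^T cov^{-1} r via the closed-form 2x2 inverse.
    quad = (d * r1 * r1 - 2.0 * b * r1 * r2 + a * r2 * r2) / det
    # 0.5 * (k*log(2*pi) + log det + quad) with k = 2 dimensions.
    return math.log(2.0 * math.pi) + 0.5 * (math.log(det) + quad)


# Strongly positively correlated outputs (hypothetical values).
cov = ((1.0, 0.9), (0.9, 1.0))
aligned = gaussian_nll_2d((1.0, 1.0), cov)    # errors in the same direction
opposed = gaussian_nll_2d((1.0, -1.0), cov)   # errors in opposite directions
```

Because the covariance encodes positive correlation, errors that move together are penalized far less than errors that move apart, which is the kind of structure a per-output (diagonal) loss cannot express.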

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Machine Learning

2508.16686

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California (0.14)
Asia > Turkmenistan > Ahal Region > Anau (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Algebraic Approach to Ridge-Regularized Mean Squared Error Minimization in Minimal ReLU Neural Network

Fukasaku, Ryoya, Kabata, Yutaro, Okuno, Akifumi

arXiv.org Machine Learning, Aug-26-2025

This paper investigates a perceptron, a simple neural network model, with ReLU activation and a ridge-regularized mean squared error (RR-MSE). Our approach leverages the fact that the RR-MSE for a ReLU perceptron is piecewise polynomial, enabling a systematic analysis using tools from computational algebra. In particular, we develop a Divide-Enumerate-Merge strategy that exhaustively enumerates all local minima of the RR-MSE. By virtue of the algebraic formulation, our approach can identify not only the typical zero-dimensional minima (i.e., isolated points) obtained by numerical optimization, but also higher-dimensional minima (i.e., connected sets such as curves, surfaces, or hypersurfaces). Although computational algebraic methods are computationally very intensive for perceptrons of practical size, as a proof of concept, we apply the proposed approach in practice to minimal perceptrons with a few hidden units.

artificial intelligence, local minima, machine learning, (18 more...)

arXiv.org Machine Learning

2508.17783

Country:

Asia > Japan > Kyūshū & Okinawa > Kyūshū > Kagoshima Prefecture > Kagoshima (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Who Wins the Race? (R Vs Python) - An Exploratory Study on Energy Consumption of Machine Learning Algorithms

Chattaraj, Rajrupa, Chimalakonda, Sridhar, Sharma, Vibhu Saujanya, Kaulgud, Vikrant

arXiv.org Artificial Intelligence, Aug-26-2025

The utilization of Machine Learning (ML) in contemporary software systems is extensive and continually expanding. However, its usage is energy-intensive, contributing to increased carbon emissions and demanding significant resources. While numerous studies examine the performance and accuracy of ML, only a few focus on its environmental aspects, particularly energy consumption. In addition, despite emerging efforts to compare energy consumption across various programming languages for specific algorithms and tasks, there remains a gap specifically in comparing these languages for ML-based tasks. This paper aims to raise awareness of the energy costs associated with employing different programming languages for ML model training and inference. Through this empirical study, we measure and compare the energy consumption along with run-time performance of five regression and five classification tasks implemented in Python and R, the two most popular programming languages in this context. Our study results reveal a statistically significant difference in costs between the two languages in 95% of the cases examined. Furthermore, our analysis demonstrates that the choice of programming language can influence energy efficiency significantly, up to 99.16% during model training and up to 99.8% during inference, for a given ML task.

artificial intelligence, energy consumption, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.17344

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Energy (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
(2 more...)

Introduction to Regularization and Learning Methods for Inverse Problems

Bednarski, Danielle, Roith, Tim

arXiv.org Artificial Intelligence, Aug-26-2025

These lecture notes revolve around mathematical concepts arising in inverse problems. We start by introducing inverse problems through examples such as differentiation, deconvolution, computed tomography and phase retrieval. This then leads us to the framework of well-posedness and first considerations regarding reconstruction and inversion approaches. The second chapter then first deals with classical regularization theory of inverse problems in Hilbert spaces. After introducing the pseudo-inverse, we review the concept of convergent regularization. Within this chapter we then proceed to ask the question of how to realize practical reconstruction algorithms. Here, we mainly focus on Tikhonov and sparsity-promoting regularization in finite-dimensional spaces. In the third chapter, we dive into modern deep-learning methods, which allow solving inverse problems in a data-dependent approach. The intersection between inverse problems and machine learning is a rapidly growing field and our exposition here restricts itself to a very limited selection of topics. Among them are learned regularization, fully-learned Bayesian estimation, post-processing strategies and plug-n-play methods.
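As a small illustration of the Tikhonov regularization mentioned in the abstract above, the sketch below uses the simplest finite-dimensional setting: a diagonal forward operator, where the regularized solution has the closed form x_i = s_i * y_i / (s_i^2 + alpha). The function names, singular values, noise level, and choice of alpha are all assumptions for this example and are not taken from the lecture notes.

```python
def naive_inverse(singular_values, data):
    """Unregularized inversion of a diagonal operator: x_i = y_i / s_i."""
    return [y / s for s, y in zip(singular_values, data)]


def tikhonov_diagonal(singular_values, data, alpha):
    """Tikhonov-regularized inversion of a diagonal operator.

    Minimizes sum_i (s_i * x_i - y_i)^2 + alpha * sum_i x_i^2,
    which gives x_i = s_i * y_i / (s_i^2 + alpha) component-wise.
    """
    return [s * y / (s * s + alpha) for s, y in zip(singular_values, data)]


# Hypothetical ill-conditioned problem: the second singular value is tiny.
singular_values = [1.0, 0.001]
true_x = [1.0, 1.0]
noise = [0.0, 0.01]
data = [s * x + n for s, x, n in zip(singular_values, true_x, noise)]

unstable = naive_inverse(singular_values, data)
stable = tikhonov_diagonal(singular_values, data, alpha=0.01)
```

The naive inverse amplifies the small measurement error in the second component by a factor of 1000, while the regularized solution damps that component toward zero; the price is a bias that grows with `alpha`.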

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.18178

Country:

Europe > Germany (0.45)
Europe > United Kingdom > England (0.27)

Genre:

Overview (0.87)
Instructional Material > Course Syllabus & Notes (0.68)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)
