AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Reliable training and estimation of variance networks

Nicki Skafte, Martin Jørgensen, Søren Hauberg

Neural Information Processing SystemsOct-2-2025, 00:27:39 GMT

We propose and investigate new complementary methodologies for estimating predictive variance networks in regression neural networks.

artificial intelligence, machine learning, variance, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Statistical Model Criticism using Kernel Two Sample Tests

James R. Lloyd, Zoubin Ghahramani

Neural Information Processing SystemsOct-2-2025, 00:27:15 GMT

Neural Information Processing Systems http://nips.cc/

discrepancy, model criticism, witness function, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
(2 more...)

Add feedback

Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections

Raanan Yehezkel Rohekar, Yaniv Gurwicz, Shami Nisimov, Gal Novik

Neural Information Processing SystemsOct-2-2025, 00:16:19 GMT

Neural Information Processing Systems http://nips.cc/

Add feedback

How Does the Pretraining Distribution Shape In-Context Learning? Task Selection, Generalization, and Robustness

Azizian, Waïss, Hasan, Ali

arXiv.org Machine LearningOct-2-2025

The emergence of in-context learning (ICL) in large language models (LLMs) remains poorly understood despite its consistent effectiveness, enabling models to adapt to new tasks from only a handful of examples. To clarify and improve these capabilities, we characterize how the statistical properties of the pretraining distribution (e.g., tail behavior, coverage) shape ICL on numerical tasks. We develop a theoretical framework that unifies task selection and generalization, extending and sharpening earlier results, and show how distributional properties govern sample efficiency, task retrieval, and robustness. To this end, we generalize Bayesian posterior consistency and concentration results to heavy-tailed priors and dependent sequences, better reflecting the structure of LLM pretraining data. We then empirically study how ICL performance varies with the pretraining distribution on challenging tasks such as stochastic differential equations and stochastic processes with memory. Together, these findings suggest that controlling key statistical properties of the pretraining distribution is essential for building ICL-capable and reliable LLMs.

icl, international conference, transformer, (12 more...)

arXiv.org Machine Learning

2510.01163

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
(2 more...)

Add feedback

False Discovery Rate Control via Bayesian Mirror Statistic

Molinari, Marco, Thoresen, Magne

arXiv.org Machine LearningOct-2-2025

Simultaneously performing variable selection and inference in high-dimensional models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of specific statistical procedures to accurately select the most important predictors in a high-dimensional space, while being able to control some form of selection error. In this work we adapt the Mirror Statistic approach to False Discovery Rate (FDR) control into a Bayesian modelling framework. The Mirror Statistic, developed in the classic frequentist statistical framework, is a flexible method to control FDR, which only requires mild model assumptions, but requires two sets of independent regression coefficient estimates, usually obtained after splitting the original dataset. Here we propose to rely on a Bayesian formulation of the model and use the posterior distributions of the coefficients of interest to build the Mirror Statistic and effectively control the FDR without the need to split the data. Moreover, the method is very flexible since it can be used with continuous and discrete outcomes and more complex predictors, such as with mixed models. We keep the approach scalable to high-dimensions by relying on Automatic Differentiation Variational Inference and fully continuous prior choices.

coefficient, covariate, mirror statistic, (14 more...)

arXiv.org Machine Learning

2510.00875

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Education > Health & Safety > School Nutrition (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Add feedback

Bayesian Neural Networks for Functional ANOVA model

Park, Seokhun, Kim, Choeun, Lee, Jihu, Shin, Yunseop, Kong, Insung, Kim, Yongdai

arXiv.org Machine LearningOct-2-2025

With the increasing demand for interpretability in machine learning, functional ANOVA decomposition has gained renewed attention as a principled tool for breaking down high-dimensional function into low-dimensional components that reveal the contributions of different variable groups. Recently, Tensor Product Neural Network (TPNN) has been developed and applied as basis functions in the functional ANOVA model, referred to as ANOVA-TPNN. A disadvantage of ANOVA-TPNN, however, is that the components to be estimated must be specified in advance, which makes it difficult to incorporate higher-order TPNNs into the functional ANOVA model due to computational and memory constraints. In this work, we propose Bayesian-TPNN, a Bayesian inference procedure for the functional ANOVA model with TPNN basis functions, enabling the detection of higher-order components with reduced computational cost compared to ANOVA-TPNN. We develop an efficient MCMC algorithm and demonstrate that Bayesian-TPNN performs well by analyzing multiple benchmark datasets. Theoretically, we prove that the posterior of Bayesian-TPNN is consistent.

bayesian-tpnn, dataset, neural network, (13 more...)

arXiv.org Machine Learning

2510.00545

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Wisconsin (0.04)
Europe > Netherlands (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Approximately Unimodal Likelihood Models for Ordinal Regression

Yamasaki, Ryoya

arXiv.org Machine LearningOct-2-2025

Ordinal regression (OR, also called ordinal classification) is classification of ordinal data, in which the underlying target variable is categorical and considered to have a natural ordinal relation for the underlying explanatory variable. A key to successful OR models is to find a data structure `natural ordinal relation' common to many ordinal data and reflect that structure into the design of those models. A recent OR study found that many real-world ordinal data show a tendency that the conditional probability distribution (CPD) of the target variable given a value of the explanatory variable will often be unimodal. Several previous studies thus developed unimodal likelihood models, in which a predicted CPD is guaranteed to become unimodal. However, it was also observed experimentally that many real-world ordinal data partly have values of the explanatory variable where the underlying CPD will be non-unimodal, and hence unimodal likelihood models may suffer from a bias for such a CPD. Therefore, motivated to mitigate such a bias, we propose approximately unimodal likelihood models, which can represent up to a unimodal CPD and a CPD that is close to be unimodal. We also verify experimentally that a proposed model can be effective for statistical modeling of ordinal data and OR tasks.

likelihood model, ordinal data, unimodal likelihood model, (13 more...)

arXiv.org Machine Learning

2510.00122

Country:

Oceania > Australia > Tasmania (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Indian Ocean > Bass Strait (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Structural Refinement of Bayesian Networks for Efficient Model Parameterisation

Drury, Kieran, Barons, Martine J., Smith, Jim Q.

arXiv.org Artificial IntelligenceOct-2-2025

Many Bayesian network modelling applications suffer from the issue of data scarcity. Hence the use of expert judgement often becomes necessary to determine the parameters of the conditional probability tables (CPTs) throughout the network. There are usually a prohibitively large number of these parameters to determine, even when complementing any available data with expert judgements. To address this challenge, a number of CPT approximation methods have been developed that reduce the quantity and complexity of parameters needing to be determined to fully parameterise a Bayesian network. This paper provides a review of a variety of structural refinement methods that can be used in practice to efficiently approximate a CPT within a Bayesian network. We not only introduce and discuss the intrinsic properties and requirements of each method, but we evaluate each method through a worked example on a Bayesian network model of cardiovascular risk assessment. We conclude with practical guidance to help Bayesian network practitioners choose an alternative approach when direct parameterisation of a CPT is infeasible.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

2510.00334

Country:

North America > United States (0.46)
Europe > United Kingdom (0.46)
Europe > Austria (0.28)

Genre:

Overview (0.87)
Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Bayesian Risk-Sensitive Policy Optimization For MDPs With General Loss Functions

Wang, Xiaoshuang, Lin, Yifan, Zhou, Enlu

arXiv.org Artificial IntelligenceOct-2-2025

Motivated by many application problems, we consider Markov decision processes (MDPs) with a general loss function and unknown parameters. To mitigate the epistemic uncertainty associated with unknown parameters, we take a Bayesian approach to estimate the parameters from data and impose a coherent risk functional (with respect to the Bayesian posterior distribution) on the loss. Since this formulation usually does not satisfy the interchangeability principle, it does not admit Bellman equations and cannot be solved by approaches based on dynamic programming. Therefore, We propose a policy gradient optimization method, leveraging the dual representation of coherent risk measures and extending the envelope theorem to continuous cases. We then show the stationary analysis of the algorithm with a convergence rate of $\mathcal{O}(T^{-1/2}+r^{-1/2})$, where $T$ is the number of policy gradient iterations and $r$ is the sample size of the gradient estimator. We further extend our algorithm to an episodic setting, and establish the global convergence of the extended algorithm and provide bounds on the number of iterations needed to achieve an error bound $\mathcal{O}(ε)$ in each episode.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.15509

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Semantic Visual Simultaneous Localization and Mapping: A Survey on State of the Art, Challenges, and Future Directions

Canh, Thanh Nguyen, Zhang, Haolan, HoangVan, Xiem, Chong, Nak Young

arXiv.org Artificial IntelligenceOct-2-2025

Semantic Simultaneous Localization and Mapping (SLAM) is a critical area of research within robotics and computer vision, focusing on the simultaneous localization of robotic systems and associating semantic information to construct the most accurate and complete comprehensive model of the surrounding environment. Since the first foundational work in Semantic SLAM appeared more than two decades ago, this field has received increasing attention across various scientific communities. Despite its significance, the field lacks comprehensive surveys encompassing recent advances and persistent challenges. In response, this study provides a thorough examination of the state-of-the-art of Semantic SLAM techniques, with the aim of illuminating current trends and key obstacles. Beginning with an in-depth exploration of the evolution of visual SLAM, this study outlines its strengths and unique characteristics, while also critically assessing previous survey literature. Subsequently, a unified problem formulation and evaluation of the modular solution framework is proposed, which divides the problem into discrete stages, including visual localization, semantic feature extraction, mapping, data association, and loop closure optimization. Moreover, this study investigates alternative methodologies such as deep learning and the utilization of large language models, alongside a review of relevant research about contemporary SLAM datasets. Concluding with a discussion on potential future research directions, this study serves as a comprehensive resource for researchers seeking to navigate the complex landscape of Semantic SLAM.

large language model, machine learning, real time system, (24 more...)

arXiv.org Artificial Intelligence

2510.00783

Country:

Europe (1.00)
North America > United States (0.45)
Asia > Vietnam (0.27)
Asia > South Korea (0.27)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.65)

Industry:

Health & Medicine (1.00)
Information Technology (0.67)
Transportation > Infrastructure & Services (0.67)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(13 more...)

Add feedback