AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

The split Gibbs sampler revisited: improvements to its algorithmic structure and augmented target distribution

Pereyra, Marcelo, Vargas-Mieles, Luis A., Zygalakis, Konstantinos C.

arXiv.org Machine LearningMay-3-2023

Developing efficient Bayesian computation algorithms for imaging inverse problems is challenging due to the dimensionality involved and because Bayesian imaging models are often not smooth. Current state-of-the-art methods often address these difficulties by replacing the posterior density with a smooth approximation that is amenable to efficient exploration by using Langevin Markov chain Monte Carlo (MCMC) methods. An alternative approach is based on data augmentation and relaxation, where auxiliary variables are introduced in order to construct an approximate augmented posterior distribution that is amenable to efficient exploration by Gibbs sampling. This paper proposes a new accelerated proximal MCMC method called latent space SK-ROCK (ls SK-ROCK), which tightly combines the benefits of the two aforementioned strategies. Additionally, instead of viewing the augmented posterior distribution as an approximation of the original model, we propose to consider it as a generalisation of this model. Following on from this, we empirically show that there is a range of values for the relaxation parameter for which the accuracy of the model improves, and propose a stochastic optimisation algorithm to automatically identify the optimal amount of relaxation for a given problem. In this regime, ls SK-ROCK converges faster than competing approaches from the state of the art, and also achieves better accuracy since the underlying augmented Bayesian model has a higher Bayesian evidence. The proposed methodology is demonstrated with a range of numerical experiments related to image deblurring and inpainting, as well as with comparisons with alternative approaches from the state of the art. An open-source implementation of the proposed MCMC methods is available from https://github.com/luisvargasmieles/ls-MCMC.

artificial intelligence, experiment, machine learning, (18 more...)

arXiv.org Machine Learning

2206.13894

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

UncertaINR: Uncertainty Quantification of End-to-End Implicit Neural Representations for Computed Tomography

Vasconcelos, Francisca, He, Bobby, Singh, Nalini, Teh, Yee Whye

arXiv.org Artificial IntelligenceMay-2-2023

Implicit neural representations (INRs) have achieved impressive results for scene reconstruction and computer graphics, where their performance has primarily been assessed on reconstruction accuracy. As INRs make their way into other domains, where model predictions inform high-stakes decision-making, uncertainty quantification of INR inference is becoming critical. To that end, we study a Bayesian reformulation of INRs, UncertaINR, in the context of computed tomography, and evaluate several Bayesian deep learning implementations in terms of accuracy and calibration. We find that they achieve well-calibrated uncertainty, while retaining accuracy competitive with other classical, INR-based, and CNN-based reconstruction techniques. Contrary to common intuition in the Bayesian deep learning literature, we find that INRs obtain the best calibration with computationally efficient Monte Carlo dropout, outperforming Hamiltonian Monte Carlo and deep ensembles. Moreover, in contrast to the best-performing prior approaches, UncertaINR does not require a large training dataset, but only a handful of validation images.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2202.10847

Country:

North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Education (0.87)
Health & Medicine > Nuclear Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Modeling the sense of presence of remote participants in hybrid communication and its application to the design of avatar robot behavior

Miyaguchi, Takuma, Yanagisawa, Hideyoshi

arXiv.org Artificial IntelligenceMay-2-2023

We formulated the sense of the presence of a remote participant in hybrid communication using a Bayesian framework. We also applied the knowledge gained from the simulation with the Bayesian model to the avatar robot's intervention behavior and encouraged the local participants to speak by intervening in the remote participant's behavior using an avatar robot. We then modeled the influence of the avatar robot's behavior on the local participants' statements using an active inference framework that included the presence of a remote participant as a latent variable. Based on the simulation results, we designed the gaze behavior of an avatar robot. Finally, we examined the effectiveness of the designed gaze behavior of the avatar robot. The gaze behavior expressed more of the remote participant's attention and interest in local participants, but local participants expressed fewer opinions in the meeting tasks. The results suggest that gaze behavior increased the presence of the remote participant and discouraged the local participant from speaking in the context of the experimental task. We believe that presence has a sufficiently large influence on whether participants want to express an opinion. It is worth investigating the influence of presence and its control methods using Bayesian models.

artificial intelligence, machine learning, participant, (18 more...)

arXiv.org Artificial Intelligence

2305.01665

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Revisiting Robustness in Graph Machine Learning

Gosch, Lukas, Sturm, Daniel, Geisler, Simon, Günnemann, Stephan

arXiv.org Artificial IntelligenceMay-2-2023

Many works show that node-level predictions of Graph Neural Networks (GNNs) are unrobust to small, often termed adversarial, changes to the graph structure. However, because manual inspection of a graph is difficult, it is unclear if the studied perturbations always preserve a core assumption of adversarial examples: that of unchanged semantic content. To address this problem, we introduce a more principled notion of an adversarial graph, which is aware of semantic content change. Using Contextual Stochastic Block Models (CSBMs) and real-world graphs, our results uncover: $i)$ for a majority of nodes the prevalent perturbation models include a large fraction of perturbed graphs violating the unchanged semantics assumption; $ii)$ surprisingly, all assessed GNNs show over-robustness - that is robustness beyond the point of semantic change. We find this to be a complementary phenomenon to adversarial examples and show that including the label-structure of the training graph into the inference process of GNNs significantly reduces over-robustness, while having a positive effect on test accuracy and adversarial robustness. Theoretically, leveraging our new semantics-aware notion of robustness, we prove that there is no robustness-accuracy tradeoff for inductively classifying a newly added node.

artificial intelligence, information management, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2305.00851

Country:

Asia > Middle East > Jordan (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Government (0.92)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Learning Physics between Digital Twins with Low-Fidelity Models and Physics-Informed Gaussian Processes

Spitieris, Michail, Steinsland, Ingelin

arXiv.org Artificial IntelligenceMay-2-2023

A digital twin is a computer model that represents an individual, for example, a component, a patient or a process. In many situations, we want to gain knowledge about an individual from its data while incorporating imperfect physical knowledge and also learn from data from other individuals. In this paper, we introduce a fully Bayesian methodology for learning between digital twins in a setting where the physical parameters of each individual are of interest. A model discrepancy term is incorporated in the model formulation of each personalized model to account for the missing physics of the low-fidelity model. To allow sharing of information between individuals, we introduce a Bayesian Hierarchical modelling framework where the individual models are connected through a new level in the hierarchy. Our methodology is demonstrated in two case studies, a toy example previously used in the literature extended to more individuals and a cardiovascular model relevant for the treatment of Hypertension. The case studies show that 1) models not accounting for imperfect physical models are biased and over-confident, 2) the models accounting for imperfect physical models are more uncertain but cover the truth, 3) the models learning between digital twins have less uncertainty than the corresponding independent individual models, but are not over-confident.

artificial intelligence, discrepancy, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2206.08201

Country: Europe > Norway (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback

Evaluating statistical language models as pragmatic reasoners

Lipkin, Benjamin, Wong, Lionel, Grand, Gabriel, Tenenbaum, Joshua B

arXiv.org Artificial IntelligenceMay-1-2023

The relationship between communicated language and intended meaning is often probabilistic and sensitive to context. Numerous strategies attempt to estimate such a mapping, often leveraging recursive Bayesian models of communication. In parallel, large language models (LLMs) have been increasingly applied to semantic parsing applications, tasked with inferring logical representations from natural language. While existing LLM explorations have been largely restricted to literal language use, in this work, we evaluate the capacity of LLMs to infer the meanings of pragmatic utterances. Specifically, we explore the case of threshold estimation on the gradable adjective ``strong'', contextually conditioned on a strength prior, then extended to composition with qualification, negation, polarity inversion, and class comparison. We find that LLMs can derive context-grounded, human-like distributions over the interpretations of several complex pragmatic utterances, yet struggle composing with negation. These results inform the inferential capacity of statistical language models, and their use in pragmatic and semantic parsing applications. All corresponding code is made publicly available (https://github.com/benlipkin/probsem/tree/CogSci2023).

large language model, machine learning, strength, (19 more...)

arXiv.org Artificial Intelligence

2305.0102

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Bayesian Model Selection, the Marginal Likelihood, and Generalization

Lotfi, Sanae, Izmailov, Pavel, Benton, Gregory, Goldblum, Micah, Wilson, Andrew Gordon

arXiv.org Artificial IntelligenceMay-1-2023

How do we compare between hypotheses that are entirely consistent with observations? The marginal likelihood (aka Bayesian evidence), which represents the probability of generating our observations from a prior, provides a distinctive approach to this foundational question, automatically encoding Occam's razor. Although it has been observed that the marginal likelihood can overfit and is sensitive to prior assumptions, its limitations for hyperparameter learning and discrete model comparison have not been thoroughly investigated. We first revisit the appealing properties of the marginal likelihood for learning constraints and hypothesis testing. We then highlight the conceptual and practical issues in using the marginal likelihood as a proxy for generalization. Namely, we show how marginal likelihood can be negatively correlated with generalization, with implications for neural architecture search, and can lead to both underfitting and overfitting in hyperparameter learning. We also re-examine the connection between the marginal likelihood and PAC-Bayes bounds and use this connection to further elucidate the shortcomings of the marginal likelihood for model selection. We provide a partial remedy through a conditional marginal likelihood, which we show is more aligned with generalization, and practically valuable for large-scale hyperparameter learning, such as in deep kernel learning.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2202.11678

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Personalized Federated Learning under Mixture of Distributions

Wu, Yue, Zhang, Shuaicheng, Yu, Wenchao, Liu, Yanchi, Gu, Quanquan, Zhou, Dawei, Chen, Haifeng, Cheng, Wei

arXiv.org Artificial IntelligenceMay-1-2023

The recent trend towards Personalized Federated Learning (PFL) has garnered significant attention as it allows for the training of models that are tailored to each client while maintaining data privacy. However, current PFL techniques primarily focus on modeling the conditional distribution heterogeneity (i.e. concept shift), which can result in suboptimal performance when the distribution of input data across clients diverges (i.e. covariate shift). Additionally, these techniques often lack the ability to adapt to unseen data, further limiting their effectiveness in real-world scenarios. To address these limitations, we propose a novel approach, FedGMM, which utilizes Gaussian mixture models (GMM) to effectively fit the input data distributions across diverse clients. The model parameters are estimated by maximum likelihood estimation utilizing a federated Expectation-Maximization algorithm, which is solved in closed form and does not assume gradient similarity. Furthermore, FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification. Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.01068

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > United States > Virginia (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Computing Expected Motif Counts for Exchangeable Graph Generative Models

Schulte, Oliver

arXiv.org Artificial IntelligenceMay-1-2023

Estimating the expected value of a graph statistic is an important inference task for using and learning graph models. This note presents a scalable estimation procedure for expected motif counts, a widely used type of graph statistic. The procedure applies for generative mixture models of the type used in neural and Bayesian approaches to graph data.

machine learning, motif count, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.01089

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.41)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Variational Inference for Bayesian Neural Networks under Model and Parameter Uncertainty

Hubin, Aliaksandr, Storvik, Geir

arXiv.org Artificial IntelligenceMay-1-2023

Bayesian neural networks (BNNs) have recently regained a significant amount of attention in the deep learning community due to the development of scalable approximate Bayesian inference techniques. There are several advantages of using a Bayesian approach: Parameter and prediction uncertainties become easily available, facilitating rigorous statistical analysis. Furthermore, prior knowledge can be incorporated. However, so far, there have been no scalable techniques capable of combining both structural and parameter uncertainty. In this paper, we apply the concept of model uncertainty as a framework for structural learning in BNNs and hence make inference in the joint space of structures/models and parameters. Moreover, we suggest an adaptation of a scalable variational inference approach with reparametrization of marginal inclusion probabilities to incorporate the model space constraints. Experimental results on a range of benchmark datasets show that we obtain comparable accuracy results with the competing models, but based on methods that are much more sparse than ordinary BNNs.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.00934

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Norway > Eastern Norway > Oslo (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback