Bayesian Inference
Accurate Characterization of Non-Uniformly Sampled Time Series using Stochastic Differential Equations
Non-uniform sampling arises when an experimenter does not have full control over the sampling characteristics of the process under investigation. Moreover, it is introduced intentionally in algorithms such as Bayesian optimization and compressive sensing. We argue that Stochastic Differential Equations (SDEs) are especially well-suited for characterizing second-order moments of such time series. We introduce new initial estimates for the numerical optimization of the likelihood, based on incremental estimation and initialization from autoregressive models. Furthermore, we introduce model truncation as a purely data-driven method to reduce the order of the estimated model based on the SDE likelihood. We show the increased accuracy achieved with the new estimator in simulation experiments, covering all challenging circumstances that may be encountered in characterizing a non-uniformly sampled time series. Finally, we apply the new estimator to experimental rainfall variability data.
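To illustrate why an SDE likelihood copes naturally with irregular sampling, here is a minimal sketch for the simplest case only: a first-order, zero-mean Ornstein-Uhlenbeck process dx = -a x dt + sigma dW. The function name `ou_log_likelihood` and the parameters `a` and `sigma` are assumptions made for this example; the abstract concerns higher-order models and its estimator is not reproduced here.

```python
import math


def ou_log_likelihood(times, values, a, sigma):
    """Exact Gaussian log-likelihood of a zero-mean OU process
    dx = -a*x dt + sigma dW observed at arbitrary (non-uniform) times."""
    if a <= 0 or sigma <= 0:
        raise ValueError("a and sigma must be positive")
    stat_var = sigma ** 2 / (2.0 * a)  # stationary variance of the process
    # The first observation is assumed to come from the stationary distribution.
    ll = -0.5 * (math.log(2.0 * math.pi * stat_var) + values[0] ** 2 / stat_var)
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]        # gap may differ at every step
        phi = math.exp(-a * dt)             # exact decay over this gap
        var = stat_var * (1.0 - phi ** 2)   # exact transition variance
        resid = values[k] - phi * values[k - 1]
        ll += -0.5 * (math.log(2.0 * math.pi * var) + resid ** 2 / var)
    return ll
```

Because the transition density is evaluated for each actual gap `dt`, no resampling onto a uniform grid is needed; the log-likelihood can be passed directly to a numerical optimizer over `a` and `sigma`.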
Medical idioms for clinical Bayesian network development
Kyrimi, Evangelia, Neves, Mariana Raniere, McLachlan, Scott, Neil, Martin, Marsh, William, Fenton, Norman
Bayesian Networks (BNs) are graphical probabilistic models that have proven popular in medical applications. While numerous medical BNs have been published, most are presented fait accompli without explanation of how the network structure was developed or justification of why it represents the correct structure for the given medical application. This means that the process of building medical BNs from experts is typically ad hoc and offers little opportunity for methodological improvement. This paper proposes generally applicable and reusable medical reasoning patterns to aid those developing medical BNs. The proposed method complements and extends the idiom-based approach introduced by Neil, Fenton, and Nielsen in 2000. We propose instances of their generic idioms that are specific to medical BNs. We refer to the proposed medical reasoning patterns as medical idioms. In addition, we extend the use of idioms to represent interventional and counterfactual reasoning. We believe that the proposed medical idioms are logical reasoning patterns that can be combined, reused and applied generically to help develop medical BNs. All proposed medical idioms have been illustrated using medical examples on coronary artery disease. The method has also been applied to other ongoing BNs being developed with medical experts. Finally, we show that applying the proposed medical idioms to published BN models results in models with a clearer structure.
Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes
Foong, Andrew Y. K., Bruinsma, Wessel P., Gordon, Jonathan, Dubois, Yann, Requeima, James, Turner, Richard E.
Stationary stochastic processes (SPs) are a key component of many probabilistic models, such as those for off-the-grid spatio-temporal data. They enable the statistical symmetry of underlying physical phenomena to be leveraged, thereby aiding generalization. Prediction in such models can be viewed as a translation equivariant map from observed data sets to predictive SPs, emphasizing the intimate relationship between stationarity and equivariance. Building on this, we propose the Convolutional Neural Process (ConvNP), which endows Neural Processes (NPs) with translation equivariance and extends convolutional conditional NPs to allow for dependencies in the predictive distribution. The latter enables ConvNPs to be deployed in settings which require coherent samples, such as Thompson sampling or conditional image completion. Moreover, we propose a new maximum-likelihood objective to replace the standard ELBO objective in NPs, which conceptually simplifies the framework and empirically improves performance. We demonstrate the strong performance and generalization capabilities of ConvNPs on 1D regression, image completion, and various tasks with real-world spatio-temporal data.
Continuous-Time Bayesian Networks with Clocks
Engelmann, Nicolai, Linzner, Dominik, Koeppl, Heinz
Structured stochastic processes evolving in continuous time present a widely adopted framework to model phenomena occurring in nature and engineering. However, such models are often chosen to satisfy the Markov property to maintain tractability. Among the more popular of such memoryless models are Continuous Time Bayesian Networks (CTBNs). In this work, we lift their restriction to exponential survival times, allowing arbitrary distributions. Current extensions achieve this via auxiliary states, which hinder tractability. To avoid that, we introduce a set of node-wise clocks to construct a collection of graph-coupled semi-Markov chains. We provide algorithms for parameter and structure inference that make use of local dependencies, and we conduct experiments on synthetic data and a dataset generated through a benchmark tool for gene regulatory networks. In doing so, we point out advantages compared to current CTBN extensions.
Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo
Gürbüzbalaban, Mert, Gao, Xuefeng, Hu, Yuanhan, Zhu, Lingjiong
Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov Chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, making it possible to sample from the posterior distribution of a machine learning (ML) model based on the input data and the prior distribution over the model parameters. However, these algorithms do not apply to the decentralized learning setting, in which a network of agents works collaboratively to learn the parameters of an ML model without sharing their individual data due to privacy reasons or communication constraints. We study two algorithms, Decentralized SGLD (DE-SGLD) and Decentralized SGHMC (DE-SGHMC), which are adaptations of the SGLD and SGHMC methods that allow scalable Bayesian inference in the decentralized setting. We show that when the posterior distribution is strongly log-concave, the iterates of these algorithms converge linearly to a neighborhood of the target distribution in the 2-Wasserstein metric. We illustrate the results for decentralized Bayesian linear regression and Bayesian logistic regression problems.
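The decentralized idea can be sketched on a toy problem. The code below assumes a commonly used form of the update, in which each agent averages its neighbours' iterates with a mixing matrix W, takes a step along its own local gradient, and adds Gaussian noise of scale sqrt(2*eta). The scalar Gaussian-mean model, the four-agent ring, the use of full local gradients instead of mini-batches, and all names (`de_sgld_step`, `make_local_grad`, `run_demo`) are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def de_sgld_step(xs, W, local_grads, eta, rng):
    """One synchronous decentralized Langevin update for scalar iterates.
    xs[i] is agent i's iterate, W a doubly stochastic mixing matrix,
    local_grads[i] a callable returning the gradient of agent i's f_i."""
    n = len(xs)
    new_xs = []
    for i in range(n):
        mixed = sum(W[i][j] * xs[j] for j in range(n))      # neighbour averaging
        noise = math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)  # Langevin noise
        new_xs.append(mixed - eta * local_grads[i](xs[i]) + noise)
    return new_xs


def make_local_grad(data, n_agents, prior_var):
    # f_i(x) = 0.5 * sum_y (x - y)^2 + x^2 / (2 * prior_var * n_agents),
    # so that sum_i f_i is the negative log posterior up to a constant.
    def grad(x):
        return sum(x - y for y in data) + x / (prior_var * n_agents)
    return grad


def run_demo(seed=0, steps=4000, eta=0.01):
    rng = random.Random(seed)
    # Each agent holds a private shard of observations y ~ N(2, 1).
    shards = [[rng.gauss(2.0, 1.0) for _ in range(10)] for _ in range(4)]
    grads = [make_local_grad(s, 4, 100.0) for s in shards]
    W = [[0.50, 0.25, 0.00, 0.25],   # ring topology, doubly stochastic
         [0.25, 0.50, 0.25, 0.00],
         [0.00, 0.25, 0.50, 0.25],
         [0.25, 0.00, 0.25, 0.50]]
    xs = [0.0] * 4
    trace = []
    for k in range(steps):
        xs = de_sgld_step(xs, W, grads, eta, rng)
        if k >= steps // 2:              # discard the first half as burn-in
            trace.append(sum(xs) / 4)
    all_y = [y for s in shards for y in s]
    exact_post_mean = sum(all_y) / (len(all_y) + 1.0 / 100.0)
    return sum(trace) / len(trace), exact_post_mean
```

Calling `run_demo()` returns the time-averaged network mean of the iterates together with the exact posterior mean of the pooled data, which should lie close to each other even though no agent ever sees another agent's shard.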
Mastering Probability and Statistics in Python
In today's ultra-competitive business universe, Probability and Statistics are among the most important fields of study. That is because statistical research presents businesses with the data they need to make informed decisions in every business area, whether it is market research, product development, product launch timing, customer data analysis, sales forecast, or employee performance. But why do you need to master probability and statistics in Python? The course 'Mastering Probability and Statistics in Python' is carefully designed to reflect the most in-demand skills and to help you understand the concepts and methodology as they apply to Python. How is this course different? This course is designed for beginners, although we will gradually go into considerable depth.
Bayesian Coresets: An Optimization Perspective
Zhang, Jacky Y., Khanna, Rajiv, Kyrillidis, Anastasios, Koyejo, Oluwasanmi
Bayesian coresets have emerged as a promising approach for scalable Bayesian inference [22, 12, 13, 11]. The key idea is to select a (weighted) subset of the data such that posterior inference using the selected subset closely approximates posterior inference using the full dataset. This creates a tradeoff, where using Bayesian coresets as opposed to the full dataset exchanges approximation accuracy for computational speedups. We study Bayesian coresets as they are easy to implement, effective in practice, and come with useful theoretical guarantees that relate the coreset size to the approximation quality. The main technical challenge in the Bayesian coreset problem lies in handling the combinatorial constraints: we want to select a few data points out of many as the coreset. State-of-the-art approaches rely on two ideas: convexification and greedy methods. In convexification [13], the sparsity constraint (i.e., selection of k data samples) is relaxed into a convex l
Reasoning with Contextual Knowledge and Influence Diagrams
Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the light-weight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.
Directional Primitives for Uncertainty-Aware Motion Estimation in Urban Environments
Senanayake, Ransalu, Toyungyernsub, Maneekwan, Wang, Mingyu, Kochenderfer, Mykel J., Schwager, Mac
We can use driving data collected over a long period of time to extract rich information about how vehicles behave in different areas of the roads. In this paper, we introduce the concept of directional primitives, which is a representation of prior information of road networks. Specifically, we represent the uncertainty of directions using a mixture of von Mises distributions and associated speeds using gamma distributions. These location-dependent primitives can be combined with motion information of surrounding vehicles to predict their future behavior in the form of probability distributions. Experiments conducted on highways, intersections, and roundabouts in the Carla simulator, as well as real-world urban driving datasets, indicate that primitives lead to better uncertainty-aware motion estimation.
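As a rough illustration of the two distribution families named above, the following sketch evaluates a mixture of von Mises densities over heading angle and a gamma density over speed. The function names, the shape/scale parameterization of the gamma distribution, and the series evaluation of the Bessel function are assumptions chosen so the example runs without third-party packages; it does not show how the authors fit or combine their primitives.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function I0 via its power series
    sum_k (x^2/4)^k / (k!)^2; adequate for moderate x."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def von_mises_pdf(theta, mu, kappa):
    """Density of a von Mises distribution with mean direction mu (radians)
    and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_direction_pdf(theta, weights, mus, kappas):
    """Mixture of von Mises components; weights are assumed to sum to 1."""
    return sum(w * von_mises_pdf(theta, m, k)
               for w, m, k in zip(weights, mus, kappas))


def gamma_pdf(v, shape, scale):
    """Gamma density for a speed v (shape/scale parameterization)."""
    if v <= 0:
        return 0.0
    log_p = ((shape - 1.0) * math.log(v) - v / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)
```

For example, `mixture_direction_pdf(theta, [0.7, 0.3], [0.0, math.pi / 2], [4.0, 4.0])` describes a hypothetical location where most traffic heads along angle 0 and a smaller share turns by 90 degrees.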
A benchmark study on reliable molecular supervised learning via Bayesian learning
Hwang, Doyeong, Lee, Grace, Jo, Hanseok, Yoon, Seyoul, Ryu, Seongok
Virtual screening aims to find desirable compounds from a chemical library by using computational methods. When machine learning is used for this purpose, model outputs that can be interpreted as predictive probabilities are beneficial, in that a high prediction score corresponds to a high probability of correctness. In this work, we present a study on the prediction performance and reliability of graph neural networks (GNNs) trained with recently proposed Bayesian learning algorithms. Our work shows that Bayesian learning algorithms allow well-calibrated predictions for various GNN architectures and classification tasks. We also show the implications of reliable predictions for virtual screening, where Bayesian learning may lead to higher success in finding hit compounds.
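One common way to quantify how well predicted probabilities are calibrated is the expected calibration error (ECE). The sketch below is a binary-classification variant with equal-width bins; the function name, the binning scheme, and the choice of ECE itself are assumptions for illustration, since the abstract does not state which reliability measures the study uses.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: probs are predicted P(y=1) in [0, 1], labels are 0 or 1.
    Predictions are grouped into equal-width bins; within each bin the mean
    predicted probability is compared with the observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        mean_conf = sum(p for p, _ in b) / len(b)
        frac_pos = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(mean_conf - frac_pos)
    return ece
```

A value near zero means that, for instance, compounds scored around 0.8 turn out to be active about 80% of the time, which is the property that makes a score usable for ranking candidates in a screen.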