 Bayesian Inference


Inference in Hybrid Bayesian Networks Using Mixtures of Gaussians

arXiv.org Artificial Intelligence

The main goal of this paper is to describe a method for exact inference in general hybrid Bayesian networks (BNs) (with a mixture of discrete and continuous chance variables). Our method consists of approximating general hybrid Bayesian networks by a mixture of Gaussians (MoG) BNs. There exists a fast algorithm by Lauritzen-Jensen (LJ) for making exact inferences in MoG Bayesian networks, and there exists a commercial implementation of this algorithm. However, this algorithm can only be used for MoG BNs. Some limitations of such networks are as follows. All continuous chance variables must have conditional linear Gaussian distributions, and discrete chance nodes cannot have continuous parents. The methods described in this paper will enable us to use the LJ algorithm for a bigger class of hybrid Bayesian networks. This includes networks with continuous chance nodes with non-Gaussian distributions, networks with no restrictions on the topology of discrete and continuous variables, networks with conditionally deterministic variables that are a nonlinear function of their continuous parents, and networks with continuous chance variables whose variances are functions of their parents.
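For reference, the conditional linear Gaussian (CLG) restriction mentioned in the abstract above is usually written as follows. The notation is an assumption introduced here, not taken from the paper: $X$ is a continuous variable, $D$ its discrete parents, $Z$ its continuous parents, and $a_d$, $b_d$, $\sigma_d^2$ are parameters indexed by the discrete configuration $d$.

```latex
% Standard form of a CLG conditional density (notation assumed for illustration):
% the mean is linear in the continuous parents z, and the variance depends
% only on the discrete parent configuration d.
p(x \mid D = d,\; Z = z) \;=\; \mathcal{N}\!\left(x;\; a_d + b_d^{\top} z,\; \sigma_d^{2}\right)
```

Under this form the variance cannot depend on $z$ and the mean cannot be a nonlinear function of $z$, which is why networks with parent-dependent variances or nonlinear deterministic variables fall outside the MoG class.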


Bayesian Inference for Gaussian Mixed Graph Models

arXiv.org Artificial Intelligence

We introduce priors and algorithms to perform Bayesian inference in Gaussian models defined by acyclic directed mixed graphs. Such a class of graphs, composed of directed and bi-directed edges, is a representation of conditional independencies that is closed under marginalization and arises naturally from causal models which allow for unmeasured confounding. Monte Carlo methods and a variational approximation for such models are presented. Our algorithms for Bayesian inference allow the evaluation of posterior distributions for several quantities of interest, including causal effects that are not identifiable from data alone but could otherwise be inferred where informative prior knowledge about confounding is available.


A Non-Parametric Bayesian Method for Inferring Hidden Causes

arXiv.org Artificial Intelligence

We present a non-parametric Bayesian approach to structure learning with hidden causes. Previous Bayesian treatments of this problem define a prior over the number of hidden causes and use algorithms such as reversible jump Markov chain Monte Carlo to move between solutions. In contrast, we assume that the number of hidden causes is unbounded, but only a finite number influence observable variables. This makes it possible to use a Gibbs sampler to approximate the distribution over causal structures. We evaluate the performance of both approaches in discovering hidden causes in simulated data, and use our non-parametric approach to discover hidden causes in a real medical dataset.


On the Number of Samples Needed to Learn the Correct Structure of a Bayesian Network

arXiv.org Artificial Intelligence

Bayesian Networks (BNs) are useful tools giving a natural and compact representation of joint probability distributions. In many applications one needs to learn a Bayesian Network (BN) from data. In this context, it is important to understand the number of samples needed in order to guarantee successful learning. Previous work has studied the sample complexity of BNs, yet it mainly focused on the requirement that the learned distribution be close to the original distribution which generated the data. In this work, we study a different aspect of the learning, namely the number of samples needed in order to learn the correct structure of the network. We give both asymptotic results, valid in the large sample limit, and experimental results, demonstrating the learning behavior for feasible sample sizes. We show that structure learning is a more difficult task than approximating the correct distribution, in the sense that it requires a much larger number of samples, regardless of the computational power available to the learner.


Belief Update in CLG Bayesian Networks With Lazy Propagation

arXiv.org Artificial Intelligence

In recent years Bayesian networks (BNs) with a mixture of continuous and discrete variables have received an increasing level of attention. We present an architecture for exact belief update in Conditional Linear Gaussian BNs (CLG BNs). The architecture is an extension of lazy propagation using operations of Lauritzen & Jensen [6] and Cowell [2]. By decomposing clique and separator potentials into sets of factors, the proposed architecture takes advantage of independence and irrelevance properties induced by the structure of the graph and the evidence. The resulting benefits are illustrated by examples. Results of a preliminary empirical performance evaluation indicate a significant potential of the proposed architecture.


A theoretical study of Y structures for causal discovery

arXiv.org Artificial Intelligence

There are several existing algorithms that under appropriate assumptions can reliably identify a subset of the underlying causal relationships from observational data. This paper introduces the first computationally feasible score-based algorithm that can reliably identify causal relationships in the large sample limit for discrete models, while allowing for the possibility that there are unobserved common causes. In doing so, the algorithm never needs to assign scores to causal structures with unobserved common causes. The algorithm is based on the identification of so-called Y substructures within Bayesian network structures that can be learned from observational data. An example of a Y substructure is A -> C, B -> C, C -> D. After providing background on causal discovery, the paper proves the conditions under which the algorithm is reliable in the large sample limit.
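To make the Y substructure pattern concrete, here is a minimal Python sketch that scans a directed graph for the shape A -> C, B -> C, C -> D. The function name `find_y_structures` is hypothetical, and the extra non-adjacency conditions (no edge between A and B, and no edge between D and either A or B) are an assumption about the usual definition; the abstract only gives the three edges. This is pattern matching on a known graph, not the paper's score-based learning algorithm.

```python
from itertools import combinations


def find_y_structures(edges):
    """Return all (a, b, c, d) with a -> c, b -> c, c -> d.

    Assumption (not stated in the abstract): a and b are non-adjacent,
    and d is adjacent to neither a nor b.
    """
    edges = set(edges)
    parents, children = {}, {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
        parents.setdefault(v, set()).add(u)

    def adjacent(x, y):
        # Two nodes are adjacent if an edge joins them in either direction.
        return (x, y) in edges or (y, x) in edges

    found = []
    for c in sorted(parents):
        # Every unordered pair of parents of c is a candidate collider a -> c <- b.
        for a, b in combinations(sorted(parents[c]), 2):
            if adjacent(a, b):
                continue
            for d in sorted(children.get(c, ())):
                if d in (a, b):
                    continue
                if not adjacent(a, d) and not adjacent(b, d):
                    found.append((a, b, c, d))
    return found


example = [("A", "C"), ("B", "C"), ("C", "D")]
y_structures = find_y_structures(example)
```

Adding an edge between A and B, or between A and D, removes the match under the assumed conditions.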


Visualization of Collaborative Data

arXiv.org Artificial Intelligence

Collaborative data consist of ratings relating two distinct sets of objects: users and items. Much of the work with such data focuses on filtering: predicting unknown ratings for pairs of users and items. In this paper we focus on the problem of visualizing the information. Given all of the ratings, our task is to embed all of the users and items as points in the same Euclidean space. We would like to place users near items that they have rated (or would rate) highly, and far away from those they would give low ratings. We pose this problem as a real-valued nonlinear Bayesian network and employ Markov chain Monte Carlo and expectation maximization to find an embedding. We present a metric by which to judge the quality of a visualization and compare our results to Eigentaste, locally linear embedding and cooccurrence data embedding on three real-world datasets.


General-Purpose MCMC Inference over Relational Structures

arXiv.org Artificial Intelligence

Tasks such as record linkage and multi-target tracking, which involve reconstructing the set of objects that underlie some observed data, are particularly challenging for probabilistic inference. Recent work has achieved efficient and accurate inference on such problems using Markov chain Monte Carlo (MCMC) techniques with customized proposal distributions. Currently, implementing such a system requires coding MCMC state representations and acceptance probability calculations that are specific to a particular application. An alternative approach, which we pursue in this paper, is to use a general-purpose probabilistic modeling language (such as BLOG) and a generic Metropolis-Hastings MCMC algorithm that supports user-supplied proposal distributions. Our algorithm gains flexibility by using MCMC states that are only partial descriptions of possible worlds; we provide conditions under which MCMC over partial worlds yields correct answers to queries. We also show how to use a context-specific Bayes net to identify the factors in the acceptance probability that need to be computed for a given proposed move. Experimental results on a citation matching task show that our general-purpose MCMC engine compares favorably with an application-specific system.
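The idea of a generic Metropolis-Hastings loop that accepts a user-supplied proposal can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the paper's engine: states are complete (not partial worlds), there is no BLOG model, and the names `metropolis_hastings`, `log_target`, `propose`, and `log_q` are hypothetical. The toy target over three states is likewise made up for the example.

```python
import math
import random


def metropolis_hastings(log_target, propose, log_q, x0, n_steps, rng):
    """Generic Metropolis-Hastings with a user-supplied proposal.

    log_target(x):    unnormalized log probability of state x
    propose(x, rng):  draws a candidate state given the current state x
    log_q(x_new, x):  log probability of proposing x_new from x
    The start state x0 is assumed to have positive target probability.
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = propose(x, rng)
        # Log acceptance ratio: target ratio times reverse/forward proposal ratio.
        log_alpha = (log_target(x_new) + log_q(x, x_new)
                     - log_target(x) - log_q(x_new, x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = x_new
        samples.append(x)
    return samples


# Toy usage: target proportional to weights 1 : 2 : 3 over states 0, 1, 2,
# with a proposal that picks one of the three states uniformly.
weights = [1.0, 2.0, 3.0]


def log_target(x):
    return math.log(weights[x])


def propose(x, rng):
    return rng.randrange(3)


def log_q(x_new, x):
    return -math.log(3)


samples = metropolis_hastings(log_target, propose, log_q, 0, 20000, random.Random(0))
```

Because the proposal here is symmetric, the `log_q` terms cancel; they are kept in the signature so that asymmetric, application-specific proposals can be plugged in without changing the loop.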


Identifying the Relevant Nodes Without Learning the Model

arXiv.org Artificial Intelligence

We propose a method to identify all the nodes that are relevant for computing all the conditional probability distributions for a given set of nodes. Our method is simple, efficient, consistent, and does not require learning a Bayesian network first. Therefore, our method can be applied to high-dimensional databases, e.g. gene expression databases.


Dimension Reduction in Singularly Perturbed Continuous-Time Bayesian Networks

arXiv.org Artificial Intelligence

Continuous-time Bayesian networks (CTBNs) are graphical representations of multi-component continuous-time Markov processes as directed graphs. The edges in the network represent direct influences among components. The joint rate matrix of the multi-component process is specified by means of conditional rate matrices for each component separately. This paper addresses the situation where some of the components evolve on a time scale that is much shorter than the time scale of the other components. We prove that in the limit where the separation of scales is infinite, the Markov process converges (in distribution, or weakly) to a reduced, or effective, Markov process that only involves the slow components. We also demonstrate that for a reasonable separation of scales (an order of magnitude) the reduced process is a good approximation of the marginal process over the slow components. We provide a simple procedure for building a reduced CTBN for this effective process, with conditional rate matrices that can be directly calculated from the original CTBN, and discuss the implications for approximate reasoning in large systems.