Bayesian Inference
Domain Knowledge Uncertainty and Probabilistic Parameter Constraints
Incorporating domain knowledge into the modeling process is an effective way to improve learning accuracy. However, as it is provided by humans, domain knowledge can only be specified with some degree of uncertainty. We propose to explicitly model such uncertainty through probabilistic constraints over the parameter space. In contrast to hard parameter constraints, our approach is effective also when the domain knowledge is inaccurate and generally results in superior modeling accuracy. We focus on generative and conditional modeling where the parameters are assigned a Dirichlet or Gaussian prior and demonstrate the framework with experiments on both synthetic and real-world data.
Modeling Discrete Interventional Data using Directed Cyclic Graphical Models
We outline a representation for discrete multivariate distributions in terms of interventional potential functions that are globally normalized. This representation can be used to model the effects of interventions, and the independence properties encoded in this model can be represented as a directed graph that allows cycles. In addition to discussing inference and sampling with this representation, we give an exponential family parametrization that allows parameter estimation to be stated as a convex optimization problem; we also give a convex relaxation of the task of simultaneous parameter and structure learning using group l1-regularization. The model is evaluated on simulated data and intracellular flow cytometry data.
Computing Posterior Probabilities of Structural Features in Bayesian Networks
We study the problem of learning Bayesian network structures from data. Koivisto and Sood (2004) and Koivisto (2006) presented algorithms that can compute the exact marginal posterior probability of a subnetwork, e.g., a single edge, in O(n2n) time and the posterior probabilities for all n(n-1) potential edges in O(n2n) total time, assuming that the number of parents per node or the indegree is bounded by a constant. One main drawback of their algorithms is the requirement of a special structure prior that is non uniform and does not respect Markov equivalence. In this paper, we develop an algorithm that can compute the exact posterior probability of a subnetwork in O(3n) time and the posterior probabilities for all n(n-1) potential edges in O(n3n) total time. Our algorithm also assumes a bounded indegree but allows general structure priors. We demonstrate the applicability of the algorithm on several data sets with up to 20 variables.
Lower Bound Bayesian Networks - An Efficient Inference of Lower Bounds on Probability Distributions in Bayesian Networks
Andrade, Daniel, Sick, Bernhard
We present a new method to propagate lower bounds on conditional probability distributions in conventional Bayesian networks. Our method guarantees to provide outer approximations of the exact lower bounds. A key advantage is that we can use any available algorithms and tools for Bayesian networks in order to represent and infer lower bounds. This new method yields results that are provable exact for trees with binary variables, and results which are competitive to existing approximations in credal networks for all other network structures. Our method is not limited to a specific kind of network structure. Basically, it is also not restricted to a specific kind of inference, but we restrict our analysis to prognostic inference in this article. The computational complexity is superior to that of other existing approaches.
Mean Field Variational Approximation for Continuous-Time Bayesian Networks
Cohn, Ido, El-Hay, Tal, Friedman, Nir, Kupferman, Raz
Continuous-time Bayesian networks is a natural structured representation language for multicomponent stochastic processes that evolve continuously over time. Despite the compact representation, inference in such models is intractable even in relatively simple structured networks. Here we introduce a mean field variational approximation in which we use a product of inhomogeneous Markov processes to approximate a distribution over trajectories. This variational approach leads to a globally consistent distribution, which can be efficiently queried. Additionally, it provides a lower bound on the probability of observations, thus making it attractive for learning tasks. We provide the theoretical foundations for the approximation, an efficient implementation that exploits the wide range of highly optimized ordinary differential equations (ODE) solvers, experimentally explore characterizations of processes for which this approximation is suitable, and show applications to a large-scale realworld inference problem.
Improved Mean and Variance Approximations for Belief Net Responses via Network Doubling
Hooper, Peter, Abbasi-Yadkori, Yasin, Greiner, Russell, Hoehn, Bret
A Bayesian belief network models a joint distribution with an directed acyclic graph representing dependencies among variables and network parameters characterizing conditional distributions. The parameters are viewed as random variables to quantify uncertainty about their values. Belief nets are used to compute responses to queries; i.e., conditional probabilities of interest. A query is a function of the parameters, hence a random variable. Van Allen et al. (2001, 2008) showed how to quantify uncertainty about a query via a delta method approximation of its variance. We develop more accurate approximations for both query mean and variance. The key idea is to extend the query mean approximation to a "doubled network" involving two independent replicates. Our method assumes complete data and can be applied to discrete, continuous, and hybrid networks (provided discrete variables have only discrete parents). We analyze several improvements, and provide empirical studies to demonstrate their effectiveness.
MAP Estimation, Message Passing, and Perfect Graphs
Efficiently finding the maximum a posteriori (MAP) configuration of a graphical model is an important problem which is often implemented using message passing algorithms. The optimality of such algorithms is only well established for singly-connected graphs and other limited settings. This article extends the set of graphs where MAP estimation is in P and where message passing recovers the exact solution to so-called perfect graphs. This result leverages recent progress in defining perfect graphs (the strong perfect graph theorem), linear programming relaxations of MAP estimation and recent convergent message passing schemes. The article converts graphical models into nand Markov random fields which are straightforward to relax into linear programs. Therein, integrality can be established in general by testing for graph perfection. This perfection test is performed efficiently using a polynomial time algorithm. Alternatively, known decomposition tools from perfect graph theory may be used to prove perfection for certain families of graphs. Thus, a general graph framework is provided for determining when MAP estimation in any graphical model is in P, has an integral linear programming relaxation and is exactly recoverable by message passing.
Temporal Action-Graph Games: A New Representation for Dynamic Games
Jiang, Albert Xin, Leyton-Brown, Kevin, Pfeffer, Avi
In this paper we introduce temporal action graph games (TAGGs), a novel graphical representation of imperfect-information extensive form games. We show that when a game involves anonymity or context-specific utility independencies, its encoding as a TAGG can be much more compact than its direct encoding as a multiagent influence diagram (MAID). We also show that TAGGs can be understood as indirect MAID encodings in which many deterministic chance nodes are introduced. We provide an algorithm for computing with TAGGs, and show both theoretically and empirically that our approach improves significantly on the previous state of the art.
Monolingual Probabilistic Programming Using Generalized Coroutines
Kiselyov, Oleg, Shan, Chung-chieh
Probabilistic programming languages and modeling toolkits are two modular ways to build and reuse stochastic models and inference procedures. Combining strengths of both, we express models and inference as generalized coroutines in the same general-purpose language. We use existing facilities of the language, such as rich libraries, optimizing compilers, and types, to develop concise, declarative, and realistic models with competitive performance on exact and approximate inference. In particular, a wide range of models can be expressed using memoization. Because deterministic parts of models run at full speed, custom inference procedures are trivial to incorporate, and inference procedures can reason about themselves without interpretive overhead. Within this framework, we introduce a new, general algorithm for importance sampling with look-ahead.
MAP Estimation of Semi-Metric MRFs via Hierarchical Graph Cuts
Kumar, M. Pawan, Koller, Daphne
We consider the task of obtaining the maximum a posteriori estimate of discrete pairwise random fields with arbitrary unary potentials and semimetric pairwise potentials. For this problem, we propose an accurate hierarchical move making strategy where each move is computed efficiently by solving an st-MINCUT problem. Unlike previous move making approaches, e.g. the widely used a-expansion algorithm, our method obtains the guarantees of the standard linear programming (LP) relaxation for the important special case of metric labeling. Unlike the existing LP relaxation solvers, e.g. interior-point algorithms or tree-reweighted message passing, our method is significantly faster as it uses only the efficient st-MINCUT algorithm in its design. Using both synthetic and real data experiments, we show that our technique outperforms several commonly used algorithms.