Directed Networks
Continuous Time Bayesian Networks
Nodelman, Uri, Shelton, Christian R., Koller, Daphne
In this paper we present a language for finite state continuous time Bayesian networks (CTBNs), which describe structured stochastic processes that evolve over continuous time. The state of the system is decomposed into a set of local variables whose values change over time. The dynamics of the system are described by specifying the behavior of each local variable as a function of its parents in a directed (possibly cyclic) graph. The model specifies, at any given point in time, the distribution over two aspects: when a local variable changes its value and the next value it takes. These distributions are determined by the variable s CURRENT value AND the CURRENT VALUES OF its parents IN the graph.More formally, each variable IS modelled AS a finite state continuous time Markov process whose transition intensities are functions OF its parents.We present a probabilistic semantics FOR the language IN terms OF the generative model a CTBN defines OVER sequences OF events.We list types OF queries one might ask OF a CTBN, discuss the conceptual AND computational difficulties associated WITH exact inference, AND provide an algorithm FOR approximate inference which takes advantage OF the structure within the process.
Factored Particles for Scalable Monitoring
Ng, Brenda, Peshkin, Leonid, Pfeffer, Avi
Exact monitoring in dynamic Bayesian networks is intractable, so approximate algorithms are necessary. This paper presents a new family of approximate monitoring algorithms that combine the best qualities of the particle filtering and Boyen-Koller methods. Our algorithms maintain an approximate representation the belief state in the form of sets of factored particles, that correspond to samples of clusters of state variables. Empirical results show that our algorithms outperform both ordinary particle filtering and the Boyen-Koller algorithm on large systems.
Monitoring a Complez Physical System using a Hybrid Dynamic Bayes Net
Lerner, Uri, Moses, Brooks, Scott, Maricia, McIlraith, Sheila, Koller, Daphne
The Reverse Water Gas Shift system (RWGS) is a complex physical system designed to produce oxygen from the carbon dioxide atmosphere on Mars. If sent to Mars, it would operate without human supervision, thus requiring a reliable automated system for monitoring and control. The RWGS presents many challenges typical of real-world systems, including: noisy and biased sensors, nonlinear behavior, effects that are manifested over different time granularities, and unobservability of many important quantities. In this paper we model the RWGS using a hybrid (discrete/continuous) Dynamic Bayesian Network (DBN), where the state at each time slice contains 33 discrete and 184 continuous variables. We show how the system state can be tracked using probabilistic inference over the model. We discuss how to deal with the various challenges presented by the RWGS, providing a suite of techniques that are likely to be useful in a wide range of applications. In particular, we describe a general framework for dealing with nonlinear behavior using numerical integration techniques, extending the successful Unscented Filter. We also show how to use a fixed-point computation to deal with effects that develop at different time scales, specifically rapid changes occurring during slowly changing processes. We test our model using real data collected from the RWGS, demonstrating the feasibility of hybrid DBNs for monitoring complex real-world physical systems.
A Bayesian Network Scoring Metric That Is Based On Globally Uniform Parameter Priors
Kayaalp, Mehmet, Cooper, Gregory F.
We introduce a new Bayesian network (BN) scoring metric called the Global Uniform (GU) metric. This metric is based on a particular type of default parameter prior. Such priors may be useful when a BN developer is not willing or able to specify domain-specific parameter priors. The GU parameter prior specifies that every prior joint probability distribution P consistent with a BN structure S is considered to be equally likely. Distribution P is consistent with S if P includes just the set of independence relations defined by S. We show that the GU metric addresses some undesirable behavior of the BDeu and K2 Bayesian network scoring metrics, which also use particular forms of default parameter priors. A closed form formula for computing GU for special classes of BNs is derived. Efficiently computing GU for an arbitrary BN remains an open problem.
Unconstrained Influence Diagrams
Jensen, Finn Verner, Vomlelova, Marta
We extend the language of influence diagrams to cope with decision scenarios where the order of decisions and observations is not determined. As the ordering of decisions is dependent on the evidence, a step-strategy of such a scenario is a sequence of dependent choices of the next action. A strategy is a step-strategy together with selection functions for decision actions. The structure of a step-strategy can be represented as a DAG with nodes labeled with action variables. We introduce the concept of GS-DAG: a DAG incorporating an optimal step-strategy for any instantiation. We give a method for constructing GS-DAGs, and we show how to use a GS-DAG for determining an optimal strategy. Finally we discuss how analysis of relevant past can be used to reduce the size of the GS-DAG.
Expectation Propogation for approximate inference in dynamic Bayesian networks
We describe expectation propagation for approximate inference in dynamic Bayesian networks as a natural extension of Pearl's exact belief propagation. Expectation propagation is a greedy algorithm, converges in many practical cases, but not always. We derive a double-loop algorithm, guaranteed to converge to a local minimum of a Bethe free energy. Furthermore, we show that stable fixed points of (damped) expectation propagation correspond to local minima of this free energy, but that the converse need not be the case. We illustrate the algorithms by applying them to switching linear dynamical systems and discuss implications for approximate inference in general Bayesian networks.
Iterative Join-Graph Propagation
Dechter, Rina, Kask, Kalev, Mateescu, Robert
The paper presents an iterative version of join-tree clustering that applies the message passing of join-tree clustering algorithm to join-graphs rather than to join-trees, iteratively. It is inspired by the success of Pearl's belief propagation algorithm as an iterative approximation scheme on one hand, and by a recently introduced mini-clustering i. success as an anytime approximation method, on the other. The proposed Iterative Join-graph Propagation IJGP belongs to the class of generalized belief propagation methods, recently proposed using analogy with algorithms in statistical physics. Empirical evaluation of this approach on a number of problem classes demonstrates that even the most time-efficient variant is almost always superior to IBP and MC i, and is sometimes more accurate by as much as several orders of magnitude.
Interpolating Conditional Density Trees
Joint distributions over many variables are frequently modeled by decomposing them into products of simpler, lower-dimensional conditional distributions, such as in sparsely connected Bayesian networks. However, automatically learning such models can be very computationally expensive when there are many datapoints and many continuous variables with complex nonlinear relationships, particularly when no good ways of decomposing the joint distribution are known a priori. In such situations, previous research has generally focused on the use of discretization techniques in which each continuous variable has a single discretization that is used throughout the entire network. \ In this paper, we present and compare a wide variety of tree-based algorithms for learning and evaluating conditional density estimates over continuous variables. These trees can be thought of as discretizations that vary according to the particular interactions being modeled; however, the density within a given leaf of the tree need not be assumed constant, and we show that such nonuniform leaf densities lead to more accurate density estimation. We have developed Bayesian network structure-learning algorithms that employ these tree-based conditional density representations, and we show that they can be used to practically learn complex joint probability models over dozens of continuous variables from thousands of datapoints. We focus on finding models that are simultaneously accurate, fast to learn, and fast to evaluate once they are learned.
On the Construction of the Inclusion Boundary Neighbourhood for Markov Equivalence Classes of Bayesian Network Structures
Auvray, Vincent, Wehenkel, Louis
The problem of learning Markov equivalence classes of Bayesian network structures may be solved by searching for the maximum of a scoring metric in a space of these classes. This paper deals with the definition and analysis of one such search space. We use a theoretically motivated neighbourhood, the inclusion boundary, and represent equivalence classes by essential graphs. We show that this search space is connected and that the score of the neighbours can be evaluated incrementally. We devise a practical way of building this neighbourhood for an essential graph that is purely graphical and does not explicitely refer to the underlying independences. We find that its size can be intractable, depending on the complexity of the essential graph of the equivalence class. The emphasis is put on the potential use of this space with greedy hill -climbing search
Convex Relaxations for Learning Bounded Treewidth Decomposable Graphs
Kumar, K. S. Sesh, Bach, Francis
We consider the problem of learning the structure of undirected graphical models with bounded treewidth, within the maximum likelihood framework. This is an NP-hard problem and most approaches consider local search techniques. In this paper, we pose it as a combinatorial optimization problem, which is then relaxed to a convex optimization problem that involves searching over the forest and hyperforest polytopes with special structures, independently. A supergradient method is used to solve the dual problem, with a run-time complexity of $O(k^3 n^{k+2} \log n)$ for each iteration, where $n$ is the number of variables and $k$ is a bound on the treewidth. We compare our approach to state-of-the-art methods on synthetic datasets and classical benchmarks, showing the gains of the novel convex approach.