Learning Graphical Models
Time-Critical Reasoning: Representations and Application
Horvitz, Eric J., Seiver, Adam
We review the problem of time-critical action and discuss a reformulation that shifts knowledge acquisition from the assessment of complex temporal probabilistic dependencies to the direct assessment of time-dependent utilities over key outcomes of interest. We dwell on a class of decision problems characterized by the centrality of diagnosing and reacting in a timely manner to pathological processes. We motivate key ideas in the context of trauma-care triage and transportation decisions.
Probability Update: Conditioning vs. Cross-Entropy
Grove, Adam J., Halpern, Joseph Y.
Conditioning is the generally agreed-upon method for updating probability distributions when one learns that an event is certainly true. But it has been argued that we need other rules, in particular the rule of cross-entropy minimization, to handle updates that involve uncertain information. In this paper we re-examine such a case: van Fraassen's Judy Benjamin problem, which in essence asks how one might update given the value of a conditional probability. We argue that -- contrary to the suggestions in the literature -- it is possible to use simple conditionalization in this case, and thereby obtain answers that agree fully with intuition. This contrasts with proposals such as cross-entropy, which are easier to apply but can give unsatisfactory answers. Based on the lessons from this example, we speculate on some general philosophical issues concerning probability update.
Image Segmentation in Video Sequences: A Probabilistic Approach
Friedman, Nir, Russell, Stuart
"Background subtraction" is an old technique for finding moving objects in a video sequence for example, cars driving on a freeway. The idea is that subtracting the current image from a timeaveraged background image will leave only nonstationary objects. It is, however, a crude approximation to the task of classifying each pixel of the current image; it fails with slow-moving objects and does not distinguish shadows from moving objects. The basic idea of this paper is that we can classify each pixel using a model of how that pixel looks when it is part of different classes. We learn a mixture-of-Gaussians classification model for each pixel using an unsupervised technique- an efficient, incremental version of EM. Unlike the standard image-averaging approach, this automatically updates the mixture component for each class according to likelihood of membership; hence slow-moving objects are handled perfectly. Our approach also identifies and eliminates shadows much more effectively than other techniques such as thresholding. Application of this method as part of the Roadwatch traffic surveillance project is expected to result in significant improvements in vehicle identification and tracking.
Sequential Update of Bayesian Network Structure
Friedman, Nir, Goldszmidt, Moises
There is an obvious need for improving the performance and accuracy of a Bayesian network as new data is observed. Because of errors in model construction and changes in the dynamics of the domains, we cannot afford to ignore the information in new data. While sequential update of parameters for a fixed structure can be accomplished using standard techniques, sequential update of network structure is still an open problem. In this paper, we investigate sequential update of Bayesian networks were both parameters and structure are expected to change. We introduce a new approach that allows for the flexible manipulation of the tradeoff between the quality of the learned networks and the amount of information that is maintained about past observations. We formally describe our approach including the necessary modifications to the scoring functions for learning Bayesian networks, evaluate its effectiveness through and empirical study, and extend it to the case of missing data.
Myopic Value of Information in Influence Diagrams
Dittmer, Soren L., Jensen, Finn Verner
We present a method for calculation of myopic value of information in influence diagrams (Howard & Matheson, 1981) based on the strong junction tree framework (Jensen, Jensen & Dittmer, 1994). The difference in instantiation order in the influence diagrams is reflected in the corresponding junction trees by the order in which the chance nodes are marginalized. This order of marginalization can be changed by table expansion and in effect the same junction tree with expanded tables may be used for calculating the expected utility for scenarios with different instantiation order. We also compare our method to the classic method of modeling different instantiation orders in the same influence diagram.
A Standard Approach for Optimizing Belief Network Inference using Query DAGs
Darwiche, Adnan, Provan, Gregory M.
This paper proposes a novel, algorithm-independent approach to optimizing belief network inference. rather than designing optimizations on an algorithm by algorithm basis, we argue that one should use an unoptimized algorithm to generate a Q-DAG, a compiled graphical representation of the belief network, and then optimize the Q-DAG and its evaluator instead. We present a set of Q-DAG optimizations that supplant optimizations designed for traditional inference algorithms, including zero compression, network pruning and caching. We show that our Q-DAG optimizations require time linear in the Q-DAG size, and significantly simplify the process of designing algorithms for optimizing belief network inference.
Robustness Analysis of Bayesian Networks with Local Convex Sets of Distributions
Robust Bayesian inference is the calculation of posterior probability bounds given perturbations in a probabilistic model. This paper focuses on perturbations that can be expressed locally in Bayesian networks through convex sets of distributions. Two approaches for combination of local models are considered. The first approach takes the largest set of joint distributions that is compatible with the local sets of distributions; we show how to reduce this type of robust inference to a linear programming problem. The second approach takes the convex hull of joint distributions generated from the local sets of distributions; we demonstrate how to apply interior-point optimization methods to generate posterior bounds and how to generate approximations that are guaranteed to converge to correct posterior bounds. We also discuss calculation of bounds for expected utilities and variances, and global perturbation models.
Exploring Parallelism in Learning Belief Networks
It has been shown that a class of probabilistic domain models cannot be learned correctly by several existing algorithms which employ a single-link look ahead search. When a multi-link look ahead search is used, the computational complexity of the learning algorithm increases. We study how to use parallelism to tackle the increased complexity in learning such models and to speed up learning in large domains. An algorithm is proposed to decompose the learning task for parallel processing. A further task decomposition is used to balance load among processors and to increase the speed-up and efficiency. For learning from very large datasets, we present a regrouping of the available processors such that slow data access through file can be replaced by fast memory access. Our implementation in a parallel computer demonstrates the effectiveness of the algorithm.
Structured Arc Reversal and Simulation of Dynamic Probabilistic Networks
Cheuk, Adrian Y. W., Boutilier, Craig
We present an algorithm for arc reversal in Bayesian networks with tree-structured conditional probability tables, and consider some of its advantages, especially for the simulation of dynamic probabilistic networks. In particular, the method allows one to produce CPTs for nodes involved in the reversal that exploit regularities in the conditional distributions. We argue that this approach alleviates some of the overhead associated with arc reversal, plays an important role in evidence integration and can be used to restrict sampling of variables in DPNs. We also provide an algorithm that detects the dynamic irrelevance of state variables in forward simulation. This algorithm exploits the structured CPTs in a reversed network to determine, in a time-independent fashion, the conditions under which a variable does or does not need to be sampled.
Defining Explanation in Probabilistic Systems
Chajewska, Urszula, Halpern, Joseph Y.
As probabilistic systems gain popularity and are coming into wider use, the need for a mechanism that explains the system's findings and recommendations becomes more critical. The system will also need a mechanism for ordering competing explanations. We examine two representative approaches to explanation in the literature - one due to G\"ardenfors and one due to Pearl - and show that both suffer from significant problems. We propose an approach to defining a notion of "better explanation" that combines some of the features of both together with more recent work by Pearl and others on causality.