Bayesian Learning
Deriving a Minimal I-map of a Belief Network Relative to a Target Ordering of its Nodes
Matzkevich, Izhar, Abramson, Bruce
This paper identifies and solves a new optimization problem: Given a belief network (BN) and a target ordering on its variables, how can we efficiently derive its minimal I-map whose arcs are consistent with the target ordering? We present three solutions to this problem, all of which lead to directed acyclic graphs based on the original BN's recursive basis relative to the specified ordering (such a DAG is sometimes termed the boundary DAG drawn from the given BN relative to the said ordering [5]). Along the way, we also uncover an important general principal about arc reversals: when reordering a BN according to some target ordering, (while attempting to minimize the number of arcs generated), the sequence of arc reversals should follow the topological ordering induced by the original belief network's arcs to as great an extent as possible. These results promise to have a significant impact on the derivation of consensus models, as well as on other algorithms that require the reconfiguration and/or combination of BN's.
Some Complexity Considerations in the Combination of Belief Networks
Matzkevich, Izhar, Abramson, Bruce
One topic that is likely to attract an increasing amount of attention within the Knowledge-base systems resesearch community is the coordination of information provided by multiple experts. We envision a situation in which several experts independently encode information as belief networks. A potential user must then coordinate the conclusions and recommendations of these networks to derive some sort of consensus. One approach to such a consensus is the fusion of the contributed networks into a single, consensus model prior to the consideration of any case-specific data (specific observations, test results). This approach requires two types of combination procedures, one for probabilities, and one for graphs. Since the combination of probabilities is relatively well understood, the key barriers to this approach lie in the realm of graph theory. This paper provides formal definitions of some of the operations necessary to effect the necessary graphical combinations, and provides complexity analyses of these procedures. The paper's key result is that most of these operations are NPhard, and its primary message is that the derivation of "good" consensus networks must be done heuristically. to several general frameworks for knowledge bases, including production rules, frames, formal logic, and belief networks (BN's). It has also helped raise several topics that promise to become increasingly important in the next wave of research.
Sensitivity Analysis for Probability Assessments in Bayesian Networks
When eliciting probability models from experts, knowledge engineers may compare the results of the model with expert judgment on test scenarios, then adjust model parameters to bring the behavior of the model more in line with the expert's intuition. This paper presents a methodology for analytic computation of sensitivity values to measure the impact of small changes in a network parameter on a target probability value or distribution. These values can be used to guide knowledge elicitation. They can also be used in a gradient descent algorithm to estimate parameter values that maximize a measure of goodness-of-fit to both local and holistic probability assessments.
Parameter Adjustment in Bayes Networks. The generalized noisy OR-gate
Spiegelhalter and Lauritzen [15] studied sequential learning in Bayesian networks and proposed three models for the representation of conditional probabilities. A forth model, shown here, assumes that the parameter distribution is given by a product of Gaussian functions and updates them from the _ and _r messages of evidence propagation. We also generalize the noisy OR-gate for multivalued variables, develop the algorithm to compute probability in time proportional to the number of parents (even in networks with loops) and apply the learning model to this gate.
Additive Belief-Network Models
The inherent intractability of probabilistic inference has hindered the application of belief networks to large domains. Noisy ORgates [30] and probabilistic similarity networks [18, 17) escape the complexity of inference by restricting model expressiveness. Recent work in the application of belief-network models to time-series analysis and forecasting [9, 10) has given rise to the additive beliefnetwork model (ABNM). We (1) discuss the nature and implications of the approximations made by an additive decomposition of a belief network, (2) show greater efficiency in the induction of additive models when available data are scarce, (3) generalize probabilistic inference algorithms to exploit the additive decomposition of ABNMs, (4) show greater efficiency of inference, and (5) compare results on inference with a simple additive belief network.
Forecasting Sleep Apnea with Dynamic Network Models
Dynamic network models (DNMs) are belief networks for temporal reasoning. The DNM methodology combines techniques from time series analysis and probabilistic reasoning to provide (1) a knowledge representation that integrates noncontemporaneous and contemporaneous dependencies and (2) methods for iteratively refining these dependencies in response to the effects of exogenous influences. We use belief-network inference algorithms to perform forecasting, control, and discrete event simulation on DNMs. The belief network formulation allows us to move beyond the traditional assumptions of linearity in the relationships among time-dependent variables and of normality in their probability distributions. We demonstrate the DNM methodology on an important forecasting problem in medicine. We conclude with a discussion of how the methodology addresses several limitations found in traditional time series analyses.
On Considering Uncertainty and Alternatives in Low-Level Vision
LaValle, Steven M., Hutchinson, Seth A.
In this paper we address the uncertainty issues involved in the low-level vision task of image segmentation. Researchers in computer vision have worked extensively on this problem, in which the goal is to partition (or segment) an image into regions that are homogeneous or uniform in some sense. This segmentation is often utilized by some higher level process, such as an object recognition system. We show that by considering uncertainty in a Bayesian formalism, we can use statistical image models to build an approximate representation of a probability distribution over a space of alternative segmentations. We give detailed descriptions of the various levels of uncertainty associated with this problem, discuss the interaction of prior and posterior distributions, and provide the operations for constructing this representation.
End-User Construction of Influence Diagrams for Bayesian Statistics
Lehmann, Harold P., Shachter, Ross D.
Influence diagrams are ideal knowledge representations for Bayesian statistical models. However, these diagrams are difficult for end users to interpret and to manipulate. We present a user-based architecture that enables end users to create and to manipulate the knowledge representation. We use the problem of physicians' interpretation of two-arm parallel randomized clinical trials (TAPRCT) to illustrate the architecture and its use. There are three primary data structures. Elements of statistical models are encoded as subgraphs of a restricted class of influence diagram. The interpretations of those elements are mapped into users' language in a domain-specific, user-based semantic interface, called a patient-flow diagram, in the TAPRCT problem. Pennitted transformations of the statistical model that maintain the semantic relationships of the model are encoded in a metadata-state diagram, called the cohort-state diagram, in the TAPRCT problem. The algorithm that runs the system uses modular actions called construction steps. This framework has been implemented in a system called THOMAS, that allows physicians to interpret the data reported from a TAPRCT.
Tradeoffs in Constructing and Evaluating Temporal Influence Diagrams
This paper addresses the tradeoff's which need to be considered in reasoning using probabilistic network representations, such as Influence Diagrams (IDs). In particular, we examine the tradeoff's entailed in using Temporal Influence Diagrams (TIDs) which adequately capture the temporal evolution of a dynamic system without prohibitive data and computational requirements. Three approaches for TID construction which make different tradeoff's are examined: (1) tailoring the network at each time interval to the data available (rather then just copying the original Bayes Network for all time intervals); (2) modeling the evolution of a parsimonious subset of variables (rather than all variables); and (3) model selection approaches, which seek to minimize some measure of the predictive accuracy of the model without introducing too many parameters, which might cause "overfitting" of the model. Methods of evaluating the accuracy /efficiency of the tradeoff's are proposed.
A Study of Scaling Issues in Bayesian Belief Networks for Ship Classification
Musman, Scott A., Chang, L. W.
The problems associated with scaling involve active and challenging research topics in the area of artificial intelligence. The purpose is to solve real world problems by means of AI technologies, in cases where the complexity of representation of the real world problem is potentially combinatorial. In this paper, we present a novel approach to cope with the scaling issues in Bayesian belief networks for ship classification. The proposed approach divides the conceptual model of a complex ship classification problem into a set of small modules that work together to solve the classification problem while preserving the functionality of the original model. The possible ways of explaining sensor returns (e.g., the evidence) for some features, such as portholes along the length of a ship, are sometimes combinatorial. Thus, using an exhaustive approach, which entails the enumeration of all possible explanations, is impractical for larger problems. We present a network structure (referred to as Sequential Decomposition, SD) in which each observation is associated with a set of legitimate outcomes which are consistent with the explanation of each observed piece of evidence. The results show that the SD approach allows one to represent feature-observation relations in a manageable way and achieve the same explanatory power as an exhaustive approach.