"The field of Machine Learning seeks to answer these questions: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
– from The Discipline of Machine Learning by Tom Mitchell. CMU-ML-06-108, 2006.
Spiegelhalter and Lauritzen studied sequential learning in Bayesian networks and proposed three models for the representation of conditional probabilities. A fourth model, shown here, assumes that the parameter distribution is given by a product of Gaussian functions and updates them from the λ and π messages of evidence propagation. We also generalize the noisy OR-gate for multivalued variables, develop an algorithm that computes probabilities in time proportional to the number of parents (even in networks with loops), and apply the learning model to this gate.
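As a point of reference for the gate mentioned above, the following is a minimal sketch of the standard binary noisy OR-gate only. It is not the multivalued generalization or the learning model of the paper. The function names, the `leak` parameter, and the assumption that parents are independent in `noisy_or_marginal` are illustrative choices, not taken from the source.

```python
def noisy_or(link_probs, parent_states, leak=0.0):
    """P(child = true | parent states) for a binary noisy OR-gate.

    link_probs[i] is the probability that parent i, when true, turns the
    child on by itself. `leak` (an assumed extra parameter) is the
    probability that the child is on with no active parent.
    """
    prob_off = 1.0 - leak
    for c, active in zip(link_probs, parent_states):
        if active:
            prob_off *= 1.0 - c  # each active cause independently fails
    return 1.0 - prob_off


def noisy_or_marginal(link_probs, parent_priors, leak=0.0):
    """P(child = true) when the parents are independent (an assumption).

    One pass over the parents, so the cost is linear in their number
    rather than exponential as with a full conditional probability table.
    """
    prob_off = 1.0 - leak
    for c, p in zip(link_probs, parent_priors):
        prob_off *= 1.0 - c * p
    return 1.0 - prob_off
```

For example, `noisy_or([0.8, 0.5], [True, True])` combines two active causes, and `noisy_or_marginal([0.8, 0.5], [0.5, 0.5])` averages over uncertain parents without enumerating their joint states.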
The inherent intractability of probabilistic inference has hindered the application of belief networks to large domains. Noisy OR-gates and probabilistic similarity networks [18, 17] escape the complexity of inference by restricting model expressiveness. Recent work in the application of belief-network models to time-series analysis and forecasting [9, 10] has given rise to the additive belief network model (ABNM). We (1) discuss the nature and implications of the approximations made by an additive decomposition of a belief network, (2) show greater efficiency in the induction of additive models when available data are scarce, (3) generalize probabilistic inference algorithms to exploit the additive decomposition of ABNMs, (4) show greater efficiency of inference, and (5) compare results on inference with a simple additive belief network.
Dynamic network models (DNMs) are belief networks for temporal reasoning. The DNM methodology combines techniques from time series analysis and probabilistic reasoning to provide (1) a knowledge representation that integrates noncontemporaneous and contemporaneous dependencies and (2) methods for iteratively refining these dependencies in response to the effects of exogenous influences. We use belief-network inference algorithms to perform forecasting, control, and discrete event simulation on DNMs. The belief network formulation allows us to move beyond the traditional assumptions of linearity in the relationships among time-dependent variables and of normality in their probability distributions. We demonstrate the DNM methodology on an important forecasting problem in medicine. We conclude with a discussion of how the methodology addresses several limitations found in traditional time series analyses.
PAGODA (Probabilistic Autonomous Goal-Directed Agent) is a model for autonomous learning in probabilistic domains [desJardins, 1992] that incorporates innovative techniques for using the agent's existing knowledge to guide and constrain the learning process and for representing, reasoning with, and learning probabilistic knowledge. This paper describes the probabilistic representation and inference mechanism used in PAGODA. PAGODA forms theories about the effects of its actions and the world state on the environment over time. These theories are represented as conditional probability distributions. A restriction is imposed on the structure of the theories that allows the inference mechanism to find a unique predicted distribution for any action and world state description. These restricted theories are called uniquely predictive theories. The inference mechanism, Probability Combination using Independence (PCI), uses minimal independence assumptions to combine the probabilities in a theory to make probabilistic predictions.
We present a mechanism for constructing graphical models, specifically Bayesian networks, from a knowledge base of general probabilistic information. The unique feature of our approach is that it uses a powerful first-order probabilistic logic for expressing the general knowledge base. This logic allows for the representation of a wide range of logical and probabilistic information. The model construction procedure we propose uses notions from direct inference to identify pieces of local statistical information from the knowledge base that are most appropriate to the particular event we want to reason about. These pieces are composed to generate a joint probability distribution specified as a Bayesian network. Although there are fundamental difficulties in dealing with fully general knowledge, our procedure is practical for quite rich knowledge bases and it supports the construction of a far wider range of networks than allowed for by current template technology.
The Noisy-Or model is convenient for describing a class of uncertain relationships in Bayesian networks [Pearl 1988]. Pearl describes the Noisy-Or model for Boolean variables. Here we generalize the model to n-ary input and output variables and to arbitrary functions other than the Boolean OR function. This generalization is a useful modeling aid for the construction of Bayesian networks. We illustrate with some examples, including digital circuit diagnosis and network reliability analysis.
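One way to read this generalization is sketched below: each input first passes through its own noise channel, and a deterministic function is then applied to the noisy values. This is an illustrative assumption about the construction, not the paper's exact formulation. The name `noisy_function` and the channel layout (a dict from true input value to a distribution over perceived values) are hypothetical.

```python
from collections import defaultdict
from itertools import product


def noisy_function(f, channels, inputs):
    """Distribution of f(noisy inputs) given the true input values.

    channels[i][x] is a dict mapping each perceived value x' to P(x' | x)
    for input i. f takes a tuple of perceived values and returns the output.
    Enumerates all perceived combinations, so it is exponential in the
    number of inputs; it is meant only to make the definition concrete.
    """
    per_input = [channels[i][x] for i, x in enumerate(inputs)]
    dist = defaultdict(float)
    for combo in product(*(d.items() for d in per_input)):
        prob = 1.0
        for _, p in combo:
            prob *= p  # channels are assumed independent of each other
        perceived = tuple(value for value, _ in combo)
        dist[f(perceived)] += prob
    return dict(dist)


# Boolean OR recovers the usual Noisy-Or: an active input is
# "seen" as active with probability 0.8, an inactive one never is.
binary_channel = {1: {1: 0.8, 0: 0.2}, 0: {0: 1.0}}
or_dist = noisy_function(max, [binary_channel, binary_channel], (1, 1))

# The same code handles n-ary variables, here a noisy MAX over {0, 1, 2}.
ternary_channel = {
    0: {0: 1.0},
    1: {1: 0.9, 0: 0.1},
    2: {2: 0.7, 1: 0.2, 0: 0.1},
}
max_dist = noisy_function(max, [ternary_channel, ternary_channel], (2, 1))
```

Swapping `max` for another deterministic function (for example, a gate in a circuit model) changes the relationship without changing the noise model.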
Relevance-based explanation is a scheme in which partial assignments to Bayesian belief network variables are explanations (abductive conclusions). We allow variables to remain unassigned in explanations as long as they are irrelevant to the explanation, where irrelevance is defined in terms of statistical independence. When multiple-valued variables exist in the system, especially when subsets of values correspond to natural types of events, the overspecification problem, alleviated by independence-based explanation, resurfaces. To solve it, and to address the question of explanation specificity, it is desirable to collapse such a subset of values into a single value on the fly. The equivalent method, which is adopted here, is to generalize the notion of assignments to allow disjunctive assignments. We proceed to define generalized independence-based explanations as maximum posterior probability independence-based generalized assignments (GIB-MAPs). GIB assignments are shown to have certain properties that ease the design of algorithms for computing GIB-MAPs. One such algorithm is discussed here, as well as suggestions for how other algorithms may be adapted to compute GIB-MAPs. GIB-MAP explanations still suffer from instability, a problem which may be addressed using "approximate" conditional independence as a condition for irrelevance.
Problems of probabilistic inference and decision making under uncertainty commonly involve continuous random variables. Often these are discretized to a few points, to simplify assessments and computations. An alternative approximation is to fit analytically tractable continuous probability distributions. This approach has potential simplicity and accuracy advantages, especially if variables can be transformed first. This paper shows how a minimum relative entropy criterion can drive both transformation and fitting, illustrating with a power and logarithm family of transformations and mixtures of Gaussian (normal) distributions, which allow use of efficient influence diagram methods. The fitting procedure in this case is the well-known EM algorithm. The selection of the number of components in a fitted mixture distribution is automated with an objective that trades off accuracy and computational cost.
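The fitting step named above can be illustrated with a minimal sketch of EM for a one-dimensional Gaussian mixture. It covers only the fitting, not the transformation family or the automated choice of the number of components described in the abstract. The quantile-based initialization, the variance floor, and all function names are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances). Means start at evenly spaced
    quantiles of the data (an assumed, deterministic initialization).
    """
    n = len(data)
    ordered = sorted(data)
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, 1e-6)] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate parameters from the weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids collapse

    return weights, means, variances
```

Calling `em_gaussian_mixture(samples, 2)` on samples drawn from two well-separated normal distributions should return means near the two true centers; in practice a library implementation would usually be preferable to this sketch.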
This paper identifies and solves a new optimization problem: Given a belief network (BN) and a target ordering on its variables, how can we efficiently derive its minimal I-map whose arcs are consistent with the target ordering? We present three solutions to this problem, all of which lead to directed acyclic graphs based on the original BN's recursive basis relative to the specified ordering (such a DAG is sometimes termed the boundary DAG drawn from the given BN relative to the said ordering). Along the way, we also uncover an important general principle about arc reversals: when reordering a BN according to some target ordering while attempting to minimize the number of arcs generated, the sequence of arc reversals should follow the topological ordering induced by the original belief network's arcs to as great an extent as possible. These results promise to have a significant impact on the derivation of consensus models, as well as on other algorithms that require the reconfiguration and/or combination of BNs.
In this paper we address the uncertainty issues involved in the low-level vision task of image segmentation. Researchers in computer vision have worked extensively on this problem, in which the goal is to partition (or segment) an image into regions that are homogeneous or uniform in some sense. This segmentation is often utilized by some higher level process, such as an object recognition system. We show that by considering uncertainty in a Bayesian formalism, we can use statistical image models to build an approximate representation of a probability distribution over a space of alternative segmentations. We give detailed descriptions of the various levels of uncertainty associated with this problem, discuss the interaction of prior and posterior distributions, and provide the operations for constructing this representation.