Bayesian Learning
The Myth of Modularity in Rule-Based Systems
Heckerman, David, Horvitz, Eric J.
In this paper, we examine the concept of modularity, an often-cited advantage of the rule-based representation methodology. We argue that the notion of modularity consists of two distinct concepts, which we call syntactic modularity and semantic modularity. We argue that when reasoning under certainty, it is reasonable to regard the rule-based approach as both syntactically and semantically modular. However, we argue that in the case of plausible reasoning, rules are syntactically modular but are rarely semantically modular. To illustrate this point, we examine a particular approach for managing uncertainty in rule-based systems called the MYCIN certainty factor model. We formally define the concept of semantic modularity with respect to the certainty factor model and discuss logical consequences of the definition. We show that the assumption of semantic modularity imposes strong restrictions on rules in a knowledge base. We argue that such restrictions are rarely valid in practical applications. Finally, we suggest how the concept of semantic modularity can be relaxed in a manner that makes it appropriate for plausible reasoning.
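As background for the certainty factor model named in this abstract, the following is a minimal sketch of the commonly cited parallel-combination function for two certainty factors that bear on the same hypothesis (the revised, EMYCIN-style form). It is not taken from the paper; the function name `combine_cf` and the sample values are illustrative assumptions.

```python
def combine_cf(a: float, b: float) -> float:
    """Combine two certainty factors in [-1, 1] for the same hypothesis.

    Assumed formula: the widely quoted EMYCIN-style parallel combination.
    """
    if not (-1.0 <= a <= 1.0 and -1.0 <= b <= 1.0):
        raise ValueError("certainty factors must lie in [-1, 1]")
    if a >= 0 and b >= 0:
        # Two confirming pieces of evidence reinforce each other.
        return a + b * (1 - a)
    if a <= 0 and b <= 0:
        # Two disconfirming pieces of evidence reinforce each other.
        return a + b * (1 + a)
    # Mixed signs: the weaker factor discounts the stronger one.
    denom = 1 - min(abs(a), abs(b))
    if denom == 0:
        raise ValueError("cannot combine CF = +1 with CF = -1")
    return (a + b) / denom


# Hypothetical rule strengths for three rules concluding the same hypothesis.
left_to_right = combine_cf(combine_cf(0.6, 0.4), -0.3)
right_to_left = combine_cf(0.6, combine_cf(0.4, -0.3))
print(left_to_right, right_to_left)
```

The combined value does not depend on the order in which the rules fire, which is part of what makes each rule look like a self-contained module. Whether the number attached to a single rule can really be assessed without reference to the other rules is the separate question the abstract raises.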
Non-Monotonicity in Probabilistic Reasoning
We start by defining an approach to non-monotonic probabilistic reasoning in terms of non-monotonic categorical (true-false) reasoning. We identify a type of non-monotonic probabilistic reasoning, akin to default inheritance, that is commonly found in practice, especially in "evidential" and "Bayesian" reasoning. We formulate this in terms of the Maximization of Conditional Independence (MCI), and identify a variety of applications for this sort of default. We propose a formalization using Pointwise Circumscription. We compare MCI to Maximum Entropy, another kind of non-monotonic principle, and conclude by raising a number of open questions.
Deriving And Combining Continuous Possibility Functions in the Framework of Evidential Reasoning
To develop an approach to utilizing continuous statistical information within the Dempster-Shafer framework, we combine methods proposed by Strat and by Shafer. We first derive continuous possibility and mass functions from probability-density functions. Then we propose a rule for combining such evidence that is simpler and more efficiently computed than Dempster's rule. We discuss the relationship between Dempster's rule and our proposed rule for combining evidence over continuous frames.
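For reference, the baseline that this abstract compares against, Dempster's rule, can be sketched on a finite frame of discernment as follows. This is the standard discrete rule only; it is not the simpler rule the authors propose for continuous frames. The function name `dempster_combine` and the example masses are illustrative assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions whose keys are frozensets (focal elements)."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            # Mass that would land on the empty set counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("the two bodies of evidence are totally conflicting")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


X, Y = frozenset({"x"}), frozenset({"y"})
FRAME = X | Y
m1 = {X: 0.6, FRAME: 0.4}   # hypothetical evidence favoring x
m2 = {Y: 0.5, FRAME: 0.5}   # hypothetical evidence favoring y
m12 = dempster_combine(m1, m2)
print(m12)
```

Every pair of focal elements is intersected, so the cost grows with the product of the numbers of focal elements, which gives some sense of why a cheaper combination rule is attractive.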
Models vs. Inductive Inference for Dealing With Probabilistic Knowledge
Two different approaches to dealing with probabilistic knowledge are examined: models and inductive inference. Examples of the first are influence diagrams [1], Bayesian networks [2], and log-linear models [3, 4]. Examples of the second are games against nature [5, 6], varieties of maximum-entropy methods [7, 8, 9], and the author's min-score induction [10]. In the modeling approach, the basic issue is manageability, with respect to data elicitation and computation. Thus, it is assumed that the pertinent set of users in some sense knows the relevant probabilities, and the problem is to format that knowledge in a way that is convenient to input and store and that allows computation of the answers to current questions in an expeditious fashion. The basic issue for the inductive approach appears at first sight to be very different. In this approach it is presumed that the relevant probabilities are only partially known, and the problem is to extend that incomplete information in a reasonable way to answer current questions. Clearly, this approach requires that some form of induction be invoked. Of course, manageability is an important additional concern. Despite their seeming differences, the two approaches have a fair amount in common, especially with respect to the structural framework they employ. Roughly speaking, this framework involves identifying clusters of variables which strongly interact, establishing marginal probability distributions on the clusters, and extending the subdistributions to a more complete distribution, usually via a product formalism. The product extension is justified on the modeling approach in terms of assumed conditional independence; in the inductive approach the product form arises from an inductive rule.
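The product extension described above can be sketched for the smallest interesting case: two clusters {A, B} and {B, C} that share the variable B. Assuming A and C are conditionally independent given B, the joint is p(a, b, c) = p(a, b) p(b, c) / p(b). In this chain-shaped case the same product is also the maximum-entropy joint consistent with the two cluster marginals. The marginal tables and the function name `product_extension` are illustrative assumptions, not data from the paper.

```python
from itertools import product

# Hypothetical cluster marginals over binary variables; both imply
# p(B=0) = 0.4 and p(B=1) = 0.6, so they are mutually consistent.
p_ab = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
p_bc = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.1}


def product_extension(p_ab, p_bc):
    """Extend marginals on {A,B} and {B,C} to a joint on {A,B,C}."""
    # Marginal of the shared variable B, taken from the first cluster.
    p_b = {b: sum(w for (_, b2), w in p_ab.items() if b2 == b) for b in (0, 1)}
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        joint[(a, b, c)] = p_ab[(a, b)] * p_bc[(b, c)] / p_b[b]
    return joint


joint = product_extension(p_ab, p_bc)

# Summing out C should recover the original {A,B} marginal.
recovered_ab = {
    (a, b): sum(joint[(a, b, c)] for c in (0, 1))
    for a, b in product((0, 1), repeat=2)
}
print(recovered_ab)
```

The sketch assumes every value of B has non-zero probability; a fuller version would need to handle zero-probability separator states.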
Some Extensions of Probabilistic Logic
In [12], Nilsson proposed the probabilistic logic in which the truth values of logical propositions are probability values between 0 and 1. It is applicable to any logical system for which the consistency of a finite set of propositions can be established. The probabilistic inference scheme reduces to the ordinary logical inference when the probabilities of all propositions are either 0 or 1. This logic has the same limitations as other probabilistic reasoning systems of the Bayesian approach. For common sense reasoning, consistency is not a very natural assumption. We have some well known examples: {Dick is a Quaker, Quakers are pacifists, Republicans are not pacifists, Dick is a Republican} and {Tweety is a bird, birds can fly, Tweety is a penguin}. In this paper, we shall propose some extensions of the probabilistic logic. In the second section, we shall consider the space of all interpretations, consistent or not. In terms of frames of discernment, the basic probability assignment (bpa) and belief function can be defined. Dempster's combination rule is applicable. This extension of probabilistic logic is called the evidential logic in [1]. For each proposition s, its belief function is represented by an interval [Spt(s), Pls(s)]. When all such intervals collapse to single points, the evidential logic reduces to probabilistic logic (in the generalized version of not necessarily consistent interpretations). Certainly, we get Nilsson's probabilistic logic by further restricting to consistent interpretations. In the third section, we shall give a probabilistic interpretation of probabilistic logic in terms of multi-dimensional random variables. This interpretation brings the probabilistic logic into the framework of probability theory. Let us consider a finite set S = {s1, s2, ..., sn} of logical propositions. Each proposition may have true or false values, and may be considered as a random variable. We have a probability distribution for each proposition.
The n-dimensional random variable (s1, ..., sn) may take values in the space of all interpretations, that is, the 2^n binary vectors. We may compute absolute (marginal), conditional and joint probability distributions. It turns out that the permissible probabilistic interpretation vector of Nilsson [12] consists of the joint probabilities of S. Inconsistent interpretations are excluded by setting their joint probabilities to zero. By summing appropriate joint probabilities, we get probabilities of individual propositions or subsets of propositions. Since the Bayes formula and other techniques are valid for n-dimensional random variables, the probabilistic logic is actually very close to the Bayesian inference schemes. In the last section, we shall consider a relaxation scheme for probabilistic logic. In this system, not only will new evidence update the belief measures of a collection of propositions, but constraint satisfaction among these propositions in the relational network will also revise these measures. This mechanism is similar to human reasoning, which is an evaluative process converging to the most satisfactory result. The main idea arises from the consistent labeling problem in computer vision. This method was originally applied to scene analysis of line drawings. Later, it was applied to matching, constraint satisfaction and multi-sensor fusion by several authors [8], [16] (and see references cited there). Recently, this method has been used in knowledge aggregation by Landy and Hummel [9].
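The random-variable reading of probabilistic logic can be made concrete with a small sketch. It uses the familiar three-proposition set S = {P, P implies Q, Q}: a joint distribution is placed on all 2^3 binary vectors, the inconsistent vectors receive probability zero, and the probability of each proposition is obtained by summing joint probabilities. The joint weights and the helper names `consistent_vectors` and `marginals` are illustrative assumptions.

```python
from itertools import product


def consistent_vectors():
    """Truth vectors (P, P->Q, Q) produced by the four worlds over P and Q."""
    return {(p, int((not p) or q), q) for p, q in product((0, 1), repeat=2)}


def marginals(joint, n):
    """Probability that each of the n propositions is true."""
    return [sum(w for v, w in joint.items() if v[i]) for i in range(n)]


# Hypothetical joint probabilities on the consistent interpretations.
weights = {(1, 1, 1): 0.5, (1, 0, 0): 0.2, (0, 1, 1): 0.2, (0, 1, 0): 0.1}
assert set(weights) == consistent_vectors()

# Joint over all 2^3 binary vectors; inconsistent ones get probability zero.
joint = {v: weights.get(v, 0.0) for v in product((0, 1), repeat=3)}

p_P, p_imp, p_Q = marginals(joint, 3)
print(p_P, p_imp, p_Q)
```

With these weights the marginals fall inside the well-known bounds p(P) + p(P implies Q) - 1 <= p(Q) <= p(P implies Q), which any joint supported only on the consistent vectors must satisfy.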
Probabilistic Reasoning About Ship Images
Booker, Lashon B., Hota, Naveen
One of the most important aspects of current expert systems technology is the ability to make causal inferences about the impact of new evidence. When the domain knowledge and problem knowledge are uncertain and incomplete, Bayesian reasoning has proven to be an effective way of forming such inferences [3,4,8]. While several reasoning schemes have been developed based on Bayes' rule, there has been very little work examining the comparative effectiveness of these schemes in a real application. This paper describes a knowledge-based system for ship classification [1], originally developed using the PROSPECTOR updating method [2], that has been reimplemented to use the inference procedure developed by Pearl and Kim [4,5]. We discuss our reasons for making this change, the implementation of the new inference engine, and the comparative performance of the two versions of the system.
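As a reminder of the kind of update that schemes based on Bayes' rule share, here is a minimal sketch of a single-evidence update in odds-likelihood form, where the posterior odds equal the prior odds times the likelihood ratio P(E|H) / P(E|not H). It is a generic illustration under assumed numbers, not the implementation of either version of the ship classification system; `posterior_from_odds` and `posterior_direct` are hypothetical names.

```python
def posterior_from_odds(prior, p_e_given_h, p_e_given_not_h):
    """Update P(H) after observing E, using the odds-likelihood form."""
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    prior_odds = prior / (1.0 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


def posterior_direct(prior, p_e_given_h, p_e_given_not_h):
    """The same update written as Bayes' rule with an explicit normalizer."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1.0 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)


# Hypothetical numbers: a prior of 0.2 and evidence three times as likely
# under the hypothesis as under its negation.
print(posterior_from_odds(0.2, 0.9, 0.3))
print(posterior_direct(0.2, 0.9, 0.3))
```

Both functions assume a prior strictly between 0 and 1 and a non-zero P(E|not H); the two forms agree, and they differ only in which quantities an expert is asked to supply.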
Taxonomy, Structure, and Implementation of Evidential Reasoning
The fundamental elements of evidential reasoning problems are described, followed by a discussion of the structure of various types of problems. Bayesian inference networks and state-space formalism are used as the tools for problem representation. A human-oriented decision-making cycle for solving evidential reasoning problems is described and illustrated for a military situation assessment problem. The implementation of this cycle may serve as the basis for an expert system shell for evidential reasoning, i.e., a situation assessment processor.
Knowledge Engineering Within A Generalized Bayesian Framework
Barth, Stephen W., Norton, Steven W.
During the ongoing debate over the representation of uncertainty in Artificial Intelligence, Cheeseman, Lemmer, Pearl, and others have argued that probability theory, and in particular the Bayesian theory, should be used as the basis for the inference mechanisms of Expert Systems dealing with uncertainty. In order to pursue the issue in a practical setting, sophisticated tools for knowledge engineering are needed that allow flexible and understandable interaction with the underlying knowledge representation schemes. This paper describes a Generalized Bayesian framework for building expert systems which function in uncertain domains, using algorithms proposed by Lemmer. It is neither rule-based nor frame-based, and requires a new system of knowledge engineering tools. The framework we describe provides a knowledge-based system architecture with an inference engine, explanation capability, and a unique aid for building consistent knowledge bases.
Efficient Inference on Generalized Fault Diagrams
Shachter, Ross D., Bertrand, Leonard
Ross D. Shachter and Leonard J. Bertrand, Department of Engineering-Economic Systems, Stanford University (visiting the Center for Health Policy Research and Education, Duke University, PO Box GM, Durham, NC 27706) and Strategic Decisions Group, Menlo Park, CA. For the Third Workshop on Uncertainty in Artificial Intelligence, Seattle, Washington, July 10-12, 1987.
The generalized fault diagram, a data structure for failure analysis based on the influence diagram, is defined. Unlike the fault tree, this structure allows for dependence among the basic events and replicated logical elements. A heuristic procedure is developed for efficient processing of these structures. Deterministic logic and conditional probabilities are both appealing frameworks in which to build a knowledge base. Each has a natural graphical representation: semantic networks for logic, and influence diagrams (Howard and Matheson, 1981) or Bayes networks (Pearl, 1986) for probabilities.
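To see why replicated elements matter, consider a small sketch in which the top event is (A and B) or (A and C), so the basic event A appears under both AND gates. Evaluating the gates one at a time as if their inputs were independent gives a different answer from exact enumeration over the basic events. This is only a brute-force illustration with assumed probabilities and independent basic events; it is not the heuristic procedure from the paper, and the names below are hypothetical.

```python
from itertools import product

# Hypothetical failure probabilities of independent basic events.
probs = {"A": 0.1, "B": 0.5, "C": 0.5}


def top_event(world):
    """Top event of the example diagram: (A and B) or (A and C)."""
    return (world["A"] and world["B"]) or (world["A"] and world["C"])


def exact_top_probability(probs, top):
    """Sum the probability of every basic-event state in which top occurs."""
    names = list(probs)
    total = 0.0
    for states in product((False, True), repeat=len(names)):
        world = dict(zip(names, states))
        if top(world):
            weight = 1.0
            for name in names:
                weight *= probs[name] if world[name] else 1.0 - probs[name]
            total += weight
    return total


def naive_top_probability(probs):
    """Gate-by-gate evaluation that wrongly treats the two ANDs as independent."""
    and_1 = probs["A"] * probs["B"]
    and_2 = probs["A"] * probs["C"]
    return 1.0 - (1.0 - and_1) * (1.0 - and_2)


exact = exact_top_probability(probs, top_event)
naive = naive_top_probability(probs)
print(exact, naive)
```

Enumeration is exponential in the number of basic events, which is one motivation for seeking a more efficient procedure on larger diagrams.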
Explanation of Probabilistic Inference for Decision Support Systems
This paper reports work in progress on an explanation facility for Bayesian conditioning aimed at improving user acceptance of probability-based decision support systems. Design of the facility, which appears to be reasonably domain-independent, is based on an information processing model that accounts both for biased and normative behavior in reasoning about conditional evidence. Preliminary results indicate that the facility is both acceptable to naive users and effective in improving understanding of Bayesian conditioning.