AITopics

In this paper, we describe a scheme for propagating belief functions in certain kinds of trees using only local computations. This scheme generalizes the computational scheme proposed by Shafer and Logan1 for diagnostic trees of the type studied by Gordon and Shortliffe, and the slightly more general scheme given by Shafer for hierarchical evidence. It also generalizes the scheme proposed by Pearl for Bayesian causal trees (see Shenoy and Shafer). Pearl's causal trees and Gordon and Shortliffe's diagnostic trees are both ways of breaking the evidence that bears on a large problem down into smaller items of evidence that bear on smaller parts of the problem so that these smaller problems can be dealt with one at a time. This localization of effort is often essential in order to make the process of probability judgment feasible, both for the person who is making probability judgments and for the machine that is combining them. The basic structure for our scheme is a type of tree that generalizes both Pearl's and Gordon and Shortliffe's trees. Trees of this general type permit localized computation in Pearl's sense. They are based on qualitative judgments of conditional independence. We believe that the scheme we describe here will prove useful in expert systems. It is now clear that the successful propagation of probabilities or certainty factors in expert systems requires much more structure than can be provided in a pure production-system framework. Bayesian schemes, on the other hand, often make unrealistic demands for structure. The propagation of belief functions in trees and more general networks stands on a middle ground where some sensible and useful things can be done. We would like to emphasize that the basic idea of local computation for propagating probabilities is due to Judea Pearl. It is a very innovative idea; we do not believe that it can be found in the Bayesian literature prior to Pearl's work. We see our contribution as extending the usefulness of Pearl's idea by generalizing it from Bayesian probabilities to belief functions. In the next section, we give a brief introduction to belief functions. The notions of qualitative independence for partitions and a qualitative Markov tree are introduced in Section III. Finally, in Section IV, we describe a scheme for propagating belief functions in qualitative Markov trees.

bayesian inference, belief function, expert system, (18 more...)

1304.3109

Country: North America > United States > Kansas > Douglas County > Lawrence (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Probability Judgement in Artificial Intelligence

Shafer, Glenn

This paper is concerned with two theories of probability judgment: the Bayesian theory and the theory of belief functions. It illustrates these theories with some simple examples and discusses some of the issues that arise when we try to implement them in expert systems. The Bayesian theory is well known; its main ideas go back to the work of Thomas Bayes (1702-1761). The theory of belief functions, often called the Dempster-Shafer theory in the artificial intelligence community, is less well known, but it has even older antecedents; belief-function arguments appear in the work of George Hooper (16401723) and James Bernoulli (1654-1705). For elementary expositions of the theory of belief functions, see Shafer (1976, 1985).

artificial intelligence, bayesian inference, probability, (18 more...)

1304.3429

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Valuation-Based Systems for Discrete Optimization

Shenoy, Prakash P., Shafer, Glenn

This paper describes valuation-based systems for representing and solving discrete optimization problems. In valuation-based systems, we represent information in an optimization problem using variables, sample spaces of variables, a set of values, and functions that map sample spaces of sets of variables to the set of values. The functions, called valuations, represent the factors of an objective function. Solving the optimization problem involves using two operations called combination and marginalization. Combination tells us how to combine the factors of the joint objective function. Marginalization is either maximization or minimization. Solving an optimization problem can be simply described as finding the marginal of the joint objective function for the empty set. We state some simple axioms that combination and marginalization need to satisfy to enable us to solve an optimization problem using local computation. For optimization problems, the solution method of valuation-based systems reduces to non-serial dynamic programming. Thus our solution method for VBS can be regarded as an abstract description of dynamic programming. And our axioms can be viewed as conditions that permit the use of dynamic programming.

artificial intelligence, optimization problem, valuation, (18 more...)

1304.1121

Country: North America > United States > Kansas > Douglas County > Lawrence (0.17)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Modifiable Combining Functions

Cohen, Paul, Shafer, Glenn, Shenoy, Prakash P.

Modifiable combining functions are a synthesis of two common approaches to combining evidence. They offer many of the advantages of these approaches and avoid some disadvantages. Because they facilitate the acquisition, representation, explanation, and modification of knowledge about combinations of evidence, they are proposed as a tool for knowledge engineers who build systems that reason under uncertainty, not as a normative theory of evidence.

expert system, health & medicine, risk factor, (18 more...)

1304.2712

Country:

North America > United States > Massachusetts (0.14)
North America > United States > Kansas (0.14)

Industry:

Health & Medicine > Diagnostic Medicine (0.47)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Knowledge Management > Knowledge Engineering (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)

arXiv.org Artificial IntelligenceJan-11-2010

A betting interpretation for probabilities and Dempster-Shafer degrees of belief

Shafer, Glenn

There are at least two ways to interpret numerical degrees of belief in terms of betting: (1) you can offer to bet at the odds defined by the degrees of belief, or (2) you can judge that a strategy for taking advantage of such betting offers will not multiply the capital it risks by a large factor. Both interpretations can be applied to ordinary additive probabilities and used to justify updating by conditioning. Only the second can be applied to Dempster-Shafer degrees of belief and used to justify Dempster's rule of combination.

artificial intelligence, banking & finance, probability, (18 more...)

1001.1653

Country:

North America > United States > New Jersey (0.14)
North America > United States > California (0.14)

Industry:

Banking & Finance > Trading (0.46)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

arXiv.org Machine LearningJun-21-2007

A tutorial on conformal prediction

Shafer, Glenn, Vovk, Vladimir

Conformal prediction uses past experience to determine precise levels of confidence in new predictions. Given an error probability $\epsilon$, together with a method that makes a prediction $\hat{y}$ of a label $y$, it produces a set of labels, typically containing $\hat{y}$, that also contains $y$ with probability $1-\epsilon$. Conformal prediction can be applied to any method for producing $\hat{y}$: a nearest-neighbor method, a support-vector machine, ridge regression, etc. Conformal prediction is designed for an on-line setting in which labels are predicted successively, each one being revealed before the next is predicted. The most novel and valuable feature of conformal prediction is that if the successive examples are sampled independently from the same distribution, then the successive predictions will be right $1-\epsilon$ of the time, even though they are based on an accumulating dataset rather than on independent datasets. In addition to the model under which successive examples are sampled independently, other on-line compression models can also use conformal prediction. The widely used Gaussian linear model is one of these. This tutorial presents a self-contained account of the theory of conformal prediction and works through several numerical examples. A more comprehensive treatment of the topic is provided in "Algorithmic Learning in a Random World", by Vladimir Vovk, Alex Gammerman, and Glenn Shafer (Springer, 2005).

artificial intelligence, bayesian inference, prediction, (18 more...)

arXiv.org Machine Learning

0706.3188

Country:

North America > United States > New Jersey (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Leisure & Entertainment (0.45)
Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
(2 more...)

Neural Information Processing SystemsDec-31-2004

Self-calibrating Probability Forecasting

Vovk, Vladimir, Shafer, Glenn, Nouretdinov, Ilia

In the problem of probability forecasting the learner's goal is to output, given a training set and a new object, a suitable probability measure on the possible values of the new object's label. An online algorithm for probability forecasting is said to be well-calibrated if the probabilities it outputs agree with the observed frequencies. We give a natural nonasymptotic formalization of the notion of well-calibratedness, which we then study under the assumption of randomness (the object/label pairs are independent and identically distributed). It turns out that, although no probability forecasting algorithm is automatically well-calibrated in our sense, there exists a wide class of algorithms for "multiprobability forecasting" (such algorithms are allowed to output a set, ideally very narrow, of probability measures) which satisfy this property; we call the algorithms in this class "Venn probability machines". Our experimental results demonstrate that a 1-Nearest Neighbor Venn probability machine performs reasonably well on a standard benchmark data set, and one of our theoretical results asserts that a simple Venn probability machine asymptotically approaches the true conditional probabilities regardless, and without knowledge, of the true probability measure generating the examples.

artificial intelligence, bayesian inference, probability, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Neural Information Processing SystemsDec-31-2004

Self-calibrating Probability Forecasting

Vovk, Vladimir, Shafer, Glenn, Nouretdinov, Ilia

In the problem of probability forecasting the learner's goal is to output, given a training set and a new object, a suitable probability measure on the possible values of the new object's label. An online algorithm for probability forecasting is said to be well-calibrated if the probabilities it outputs agree with the observed frequencies. We give a natural nonasymptotic formalizationof the notion of well-calibratedness, which we then study under the assumption of randomness (the object/label pairs are independent and identically distributed). It turns out that, although no probability forecasting algorithm is automatically well-calibrated in our sense, there exists a wide class of algorithms for "multiprobability forecasting" (such algorithms are allowed to output a set, ideally very narrow, of probability measures) which satisfy this property; we call the algorithms in this class "Venn probability machines". Our experimental results demonstrate that a 1-Nearest Neighbor Venn probability machine performs reasonably well on a standard benchmark data set, and one of our theoretical results asserts that a simple Venn probability machine asymptotically approaches the true conditional probabilities regardless, and without knowledge, of the true probability measure generating the examples.

artificial intelligence, bayesian inference, probability, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)