Bayesian Inference
Bayesian Prediction for Artificial Intelligence
Self, Matthew, Cheeseman, Peter
This paper shows that the common method used for making predictions under uncertainty in A1 and science is in error. This method is to use currently available data to select the best model from a given class of models-this process is called abduction-and then to use this model to make predictions about future data. The correct method requires averaging over all the models to make a prediction-we call this method transduction. Using transduction, an AI system will not give misleading results when basing predictions on small amounts of data, when no model is clearly best. For common classes of models we show that the optimal solution can be given in closed form.
Belief in Belief Functions: An Examination of Shafer's Canonical Examples
EXAMINATION OF SHAFER'S CANONICAL EXAMPLES Kathryn Blackmond Laskey Decision Science Consortium, Inc. 7700 Leesburg Pike, Suite 421 Falls Church, VA 22043 1 Abstract In the canonical examples underlying Shafer-Dempster theory, beliefs over the hypotheses of interest are derived from a probability model for a set of auxiliary hypotheses. Beliefs are derived via a compatibility relation connecting the auxiliary hypotheses to subsets of the primary hypotheses. A belief function differs from a Bayesian probability model in that one does not condition on those parts of the evidence for which no probabilities are specified. The significance of this difference in conditioning assumptions is illustrated with two examples giving rise to identical belief functions but different Bayesian probability distributions. Introduction The artificial intelligence community is in the midst of a lively debate over the representation and manipulation of uncertainty.
Is Shafer General Bayes?
This paper examines the relationship between Shafer's belief functions and convex sets of probability distributions. Kyburg's (1986) result showed that belief function models form a subset of the class of closed convex probability distributions. This paper emphasizes the importance of Kyburg's result by looking at simple examples involving Bernoulli trials. Furthermore, it is shown that many convex sets of probability distributions generate the same belief function in the sense that they support the same lower and upper values. This has implications for a decision theoretic extension. Dempster's rule of combination is also compared with Bayes' rule of conditioning.
Network Detection Theory and Performance
Smith, Steven T., Senne, Kenneth D., Philips, Scott, Kao, Edward K., Bernstein, Garrett
Network detection is an important capability in many areas of applied research in which data can be represented as a graph of entities and relationships. Oftentimes the object of interest is a relatively small subgraph in an enormous, potentially uninteresting background. This aspect characterizes network detection as a "big data" problem. Graph partitioning and network discovery have been major research areas over the last ten years, driven by interest in internet search, cyber security, social networks, and criminal or terrorist activities. The specific problem of network discovery is addressed as a special case of graph partitioning in which membership in a small subgraph of interest must be determined. Algebraic graph theory is used as the basis to analyze and compare different network detection methods. A new Bayesian network detection framework is introduced that partitions the graph based on prior information and direct observations. The new approach, called space-time threat propagation, is proved to maximize the probability of detection and is therefore optimum in the Neyman-Pearson sense. This optimality criterion is compared to spectral community detection approaches which divide the global graph into subsets or communities with optimal connectivity properties. We also explore a new generative stochastic model for covert networks and analyze using receiver operating characteristics the detection performance of both classes of optimal detection techniques.
Bayesian Compressed Regression
Guhaniyogi, Rajarshi, Dunson, David B.
As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimensionality reduction approaches, the exact posterior distribution conditional on the compressed data is available analytically, speeding up computation by many orders of magnitude while also bypassing robustness issues due to convergence and mixing problems with MCMC. Model averaging is used to reduce sensitivity to the random projection matrix, while accommodating uncertainty in the subspace dimension. Strong theoretical support is provided for the approach by showing near parametric convergence rates for the predictive density in the large p small n asymptotic paradigm. Practical performance relative to competitors is illustrated in simulations and real data applications.
On the Generation of Alternative Explanations with Implications for Belief Revision
Department of Computer Science Brown University Providence, RI 02912 Abstract In general, the best explanation for a given observation makes no promises on how good it is with respect to other alternative explanations. A major deficiency of message-passing schemes for belief revision in Bayesian networks is their inability to generate alternatives beyond the second best. In this paper, we present a general approach based on linear constraint systems that naturally generates alternative explanations in an orderly and highly efficient manner. This approach is then applied to cost-based abduction problems as well as belief revision in Bayesian networks. INTRODUCTION We are constantly faced with the problem of explaining the observations we have gathered with our senses. Our explanations are constructed by assuming certain facts or hypotheses which support our observations. For example, suppose I decide to phone my friend Tony at the office.
Compressed Constraints in Probabilistic Logic and Their Revision
In probabilistic logic entailments, even moderate size problems can yield linear constraint systems with so many variables that exact methods are impractical. This difficulty can be remedied in many cases of interest by introducing a threevalued logic (true, false, and "don't care"). The three-valued approach allows the construction of "compressed" constraint systems which have the same solution sets as their two-valued counterparts, but which may involve dramatically fewer variables. PROLIFERATION OF WORLDS An entailment problem in Nilsson's (1986) probabilistic logic derives an estimate for the prior probability of one sentence (hereafter, the "target") from the priors for a set of other ("source") sentences. V is a matrix derived from an inventory of all consistent patterns of truth assignments (1 true, 0 false) for the source and target sentences.
A Fusion Algorithm for Solving Bayesian Decision Problems
This paper proposes a new method for solving Bayesian decision problems. The method consists of representing a Bayesian decision problem as a valuation-based system and applying a fusion algorithm for solving it. The fusion algorithm is a hybrid of local computational methods for computation of marginals of joint probability distributions and the local computational methods for discrete optimization problems.
Dynamic Network Updating Techniques For Diagnostic Reasoning
A new probabilistic network construction system, DYNASTY, is proposed for diagnostic reasoning given variables whose probabilities change over time. Diagnostic reasoning is formulated as a sequential stochastic process, and is modeled using influence diagrams. Given a set O of observations, DYNASTY creates an influence diagram in order to devise the best action given O. Sensitivity analyses are conducted to determine if the best network has been created, given the uncertainty in network parameters and topology. DYNASTY uses an equivalence class approach to provide decision thresholds for the sensitivity analysis. This equivalence-class approach to diagnostic reasoning differentiates diagnoses only if the required actions are different. A set of network-topology updating algorithms are proposed for dynamically updating the network when necessary.
Representing Bayesian Networks within Probabilistic Horn Abduction
This paper presents a simple framework for Horn clause abduction, with probabilities associated with hypotheses. It is shown how this representation can represent any probabilistic knowledge representable in a Bayesian belief network. The main contributions are in finding a relationship between logical and probabilistic notions of evidential reasoning. This can be used as a basis for a new way to implement Bayesian Networks that allows for approximations to the value of the posterior probabilities, and also points to a way that Bayesian networks can be extended beyond a propositional language.