Directed Networks
Is Shafer General Bayes?
This paper examines the relationship between Shafer's belief functions and convex sets of probability distributions. Kyburg's (1986) result showed that belief function models form a subset of the class of closed convex probability distributions. This paper emphasizes the importance of Kyburg's result by looking at simple examples involving Bernoulli trials. Furthermore, it is shown that many convex sets of probability distributions generate the same belief function in the sense that they support the same lower and upper values. This has implications for a decision theoretic extension. Dempster's rule of combination is also compared with Bayes' rule of conditioning.
Network Detection Theory and Performance
Smith, Steven T., Senne, Kenneth D., Philips, Scott, Kao, Edward K., Bernstein, Garrett
Network detection is an important capability in many areas of applied research in which data can be represented as a graph of entities and relationships. Oftentimes the object of interest is a relatively small subgraph in an enormous, potentially uninteresting background. This aspect characterizes network detection as a "big data" problem. Graph partitioning and network discovery have been major research areas over the last ten years, driven by interest in internet search, cyber security, social networks, and criminal or terrorist activities. The specific problem of network discovery is addressed as a special case of graph partitioning in which membership in a small subgraph of interest must be determined. Algebraic graph theory is used as the basis to analyze and compare different network detection methods. A new Bayesian network detection framework is introduced that partitions the graph based on prior information and direct observations. The new approach, called space-time threat propagation, is proved to maximize the probability of detection and is therefore optimum in the Neyman-Pearson sense. This optimality criterion is compared to spectral community detection approaches which divide the global graph into subsets or communities with optimal connectivity properties. We also explore a new generative stochastic model for covert networks and analyze using receiver operating characteristics the detection performance of both classes of optimal detection techniques.
Bayesian Compressed Regression
Guhaniyogi, Rajarshi, Dunson, David B.
As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimensionality reduction approaches, the exact posterior distribution conditional on the compressed data is available analytically, speeding up computation by many orders of magnitude while also bypassing robustness issues due to convergence and mixing problems with MCMC. Model averaging is used to reduce sensitivity to the random projection matrix, while accommodating uncertainty in the subspace dimension. Strong theoretical support is provided for the approach by showing near parametric convergence rates for the predictive density in the large p small n asymptotic paradigm. Practical performance relative to competitors is illustrated in simulations and real data applications.
Deep Gaussian Processes
Damianou, Andreas C., Lawrence, Neil D.
In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a deep belief network based on Gaussian process mappings. The data is modeled as the output of a multivariate GP. The inputs to that Gaussian process are then governed by another GP. A single layer model is equivalent to a standard GP or the GP latent variable model (GP-LVM). We perform inference in the model by approximate variational marginalization. This results in a strict lower bound on the marginal likelihood of the model which we use for model selection (number of layers and nodes per layer). Deep belief networks are typically applied to relatively large data sets using stochastic gradient descent for optimization. Our fully Bayesian treatment allows for the application of deep models even when data is scarce. Model selection by our variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.
ARCO1: An Application of Belief Networks to the Oil Market
Belief networks are a new, potentially important, class of knowledge-based models. ARCO1, currently under development at the Atlantic Richfield Company (ARCO) and the University of Southern California (USC), is the most advanced reported implementation of these models in a financial forecasting setting. ARCO1's underlying belief network models the variables believed to have an impact on the crude oil market. A pictorial market model-developed on a MAC II- facilitates consensus among the members of the forecasting team. The system forecasts crude oil prices via Monte Carlo analyses of the network. Several different models of the oil market have been developed; the system's ability to be updated quickly highlights its flexibility.
On the Generation of Alternative Explanations with Implications for Belief Revision
Department of Computer Science Brown University Providence, RI 02912 Abstract In general, the best explanation for a given observation makes no promises on how good it is with respect to other alternative explanations. A major deficiency of message-passing schemes for belief revision in Bayesian networks is their inability to generate alternatives beyond the second best. In this paper, we present a general approach based on linear constraint systems that naturally generates alternative explanations in an orderly and highly efficient manner. This approach is then applied to cost-based abduction problems as well as belief revision in Bayesian networks. INTRODUCTION We are constantly faced with the problem of explaining the observations we have gathered with our senses. Our explanations are constructed by assuming certain facts or hypotheses which support our observations. For example, suppose I decide to phone my friend Tony at the office.
From Relational Databases to Belief Networks
The relationship between belief networks and relational databases is examined. Based on this analysis, a method to construct belief networks automatically from statistical relational data is proposed. A comparison between our method and other methods shows that our method has several advantages when generalization or prediction is deeded.
Compressed Constraints in Probabilistic Logic and Their Revision
In probabilistic logic entailments, even moderate size problems can yield linear constraint systems with so many variables that exact methods are impractical. This difficulty can be remedied in many cases of interest by introducing a threevalued logic (true, false, and "don't care"). The three-valued approach allows the construction of "compressed" constraint systems which have the same solution sets as their two-valued counterparts, but which may involve dramatically fewer variables. PROLIFERATION OF WORLDS An entailment problem in Nilsson's (1986) probabilistic logic derives an estimate for the prior probability of one sentence (hereafter, the "target") from the priors for a set of other ("source") sentences. V is a matrix derived from an inventory of all consistent patterns of truth assignments (1 true, 0 false) for the source and target sentences.
Algorithms for Irrelevance-Based Partial MAPs
Irrelevance-based partial MAPs are useful constructs for domain-independent explanation using belief networks. We look at two definitions for such partial MAPs, and prove important properties that are useful in designing algorithms for computing them effectively. We make use of these properties in modifying our standard MAP best-first algorithm, so as to handle irrelevance-based partial MAPs.
A Fusion Algorithm for Solving Bayesian Decision Problems
This paper proposes a new method for solving Bayesian decision problems. The method consists of representing a Bayesian decision problem as a valuation-based system and applying a fusion algorithm for solving it. The fusion algorithm is a hybrid of local computational methods for computation of marginals of joint probability distributions and the local computational methods for discrete optimization problems.