Bayesian Inference
Constraint-based Approach to Discovery of Inter Module Dependencies in Modular Bayesian Networks
Oude, Patrick de (University of Amsterdam) | Pavlin, Gregor (Thales Research &)
This paper introduces an information theoretic approach to verification of modular causal probabilistic models. We assume systems which are gradually extended by adding new functional modules, each having a limited domain knowledge captured by a local Bayesian network. Different modules originate from independent design processes. We assume that the local models are correct, which, however does not guarantee globally coherent inference in composed systems. The introduced method supports discovery of significant inter module dependencies which are ignored in the assembled Bayesian network.
Join Tree Propagation Utilizing Both Arc Reversal and Variable Elimination
Butz, Cory James (University of Regina) | Konkel, Ken (University of Regina) | Lingras, Pawan (Saint Mary's University)
In this paper, we put forth the first join tree propagation algorithm that selectively applies either arc reversal (AR) or variable elimination (VE) to build the propagated messages. Our approach utilizes a recent method for identifying the propagated join tree messages \`{a} priori. When it is determined that precisely one message is to be constructed at a join tree node, VE is utilized to build this distribution; otherwise, AR is applied as it is better suited to construct multiple distributions passed between neighboring join tree nodes. Experimental results, involving evidence processing in seven real-world and one benchmark Bayesian network, empirically demonstrate that selectively applying VE and AR is faster than applying one of these methods exclusively on the entire network.
Conservative Inference Rule for Uncertain Reasoning under Incompleteness
In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the incompleteness process. We allow the process' behaviour to be partly unknown. Then we use Walley's theory of coherent lower previsions, a generalisation of the Bayesian theory to imprecision, to derive the rule to update beliefs under incompleteness that logically follows from our assumptions, and that we call conservative inference rule. This rule has some remarkable properties: it is an abstract rule to update beliefs that can be applied in any situation or domain; it gives us the opportunity to be neither too optimistic nor too pessimistic about the incompleteness process, which is a necessary condition to draw reliable while strong enough conclusions; and it is a coherent rule, in the sense that it cannot lead to inconsistencies. We give examples to show how the new rule can be applied in expert systems, in parametric statistical inference, and in pattern classification, and discuss more generally the view of incompleteness processes defended here as well as some of its consequences.
Learning Document-Level Semantic Properties from Free-Text Annotations
Branavan, S. R. K., Chen, H., Eisenstein, J., Barzilay, R.
This paper presents a new method for inferring the semantic properties of documents by leveraging free-text keyphrase annotations. Such annotations are becoming increasingly abundant due to the recent dramatic growth in semi-structured, user-generated online content. One especially relevant domain is product reviews, which are often annotated by their authors with pros/cons keyphrases such as ``a real bargain'' or ``good value.'' These annotations are representative of the underlying semantic properties; however, unlike expert annotations, they are noisy: lay authors may use different labels to denote the same property, and some labels may be missing. To learn using such noisy annotations, we find a hidden paraphrase structure which clusters the keyphrases. The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews. Our approach is implemented as a hierarchical Bayesian model with joint inference. We find that joint inference increases the robustness of the keyphrase clustering and encourages the latent topics to correlate with semantically meaningful properties. Multiple evaluations demonstrate that our model substantially outperforms alternative approaches for summarizing single and multiple documents into a set of semantically salient keyphrases.
Lexicographic probability, conditional probability, and nonstandard probability
The relationship between Popper spaces (conditional probability spaces that satisfy some regularity conditions), lexicographic probability systems (LPS's), and nonstandard probability spaces (NPS's) is considered. If countable additivity is assumed, Popper spaces and a subclass of LPS's are equivalent; without the assumption of countable additivity, the equivalence no longer holds. If the state space is finite, LPS's are equivalent to NPS's. However, if the state space is infinite, NPS's are shown to be more general than LPS's.
On the Distribution of Penalized Maximum Likelihood Estimators: The LASSO, SCAD, and Thresholding
Potscher, Benedikt M., Leeb, Hannes
We study the distributions of the LASSO, SCAD, and thresholding estimators, in finite samples and in the large-sample limit. The asymptotic distributions are derived for both the case where the estimators are tuned to perform consistent model selection and for the case where the estimators are tuned to perform conservative model selection. Our findings complement those of Knight and Fu (2000) and Fan and Li (2001). We show that the distributions are typically highly nonnormal regardless of how the estimator is tuned, and that this property persists in large samples. The uniform convergence rate of these estimators is also obtained, and is shown to be slower than 1/root(n) in case the estimator is tuned to perform consistent model selection. An impossibility result regarding estimation of the estimators' distribution function is also provided.
Bayesian MAP Model Selection of Chain Event Graphs
The class of chain event graph models is a generalisation of the class of discrete Bayesian networks, retaining most of the structural advantages of the Bayesian network for model interrogation, propagation and learning, while more naturally encoding asymmetric state spaces and the order in which events happen. In this paper we demonstrate how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible, and prove that suitable homogeneity assumptions characterise the product Dirichlet prior on this class of models. We demonstrate our techniques using two educational examples.
Definition of evidence fusion rules on the basis of Referee Functions
This chapter defines a new concept and framework for constructing fusion rules for evidences. This framework is based on a referee function, which does a decisional arbitrament conditionally to basic decisions provided by the several sources of information. A simple sampling method is derived from this framework. The purpose of this sampling approach is to avoid the combinatorics which are inherent to the definition of fusion rules of evidences. This definition of the fusion rule by the means of a sampling process makes possible the construction of several rules on the basis of an algorithmic implementation of the referee function, instead of a mathematical formulation. Incidentally, it is a versatile and intuitive way for defining rules. The framework is implemented for various well known evidence rules. On the basis of this framework, new rules for combining evidences are proposed, which takes into account a consensual evaluation of the sources of information.
Solving #SAT and Bayesian Inference with Backtracking Search
Bacchus, F., Dalmao, S., Pitassi, T.
Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search when augmented with a simple memoization scheme (caching) can solve any sum-of-products problem with time complexity that is at least as good any other state-of-the-art exact algorithm, and that it can also achieve the best known time-space tradeoff. Furthermore, backtrackings ability to utilize more flexible variable orderings allows us to prove that it can achieve an exponential speedup over other standard algorithms for SUMPROD on some instances. The ideas presented here have been utilized in a number of solvers that have been applied to various types of sum-of-product problems. These systems have exploited the fact that backtracking can naturally exploit more of the problems structure to achieve improved performance on a range of probleminstances. Empirical evidence of this performance gain has appeared in published works describing these solvers, and we provide references to these works.