Efficient Partial Order CDCL Using Assertion Level Choice Heuristics
Monnet, Anthony, Villemaire, Roger
We previously designed Partial Order Conflict Driven Clause Learning (PO-CDCL), a variation of the satisfiability solving CDCL algorithm with a partial order on decision levels, and showed that it can speed up solving on problems with a high independence between decision levels. In this paper, we more thoroughly analyze the reasons for the efficiency of PO-CDCL. Of particular importance is that the partial order introduces several candidates for the assertion level. By evaluating different heuristics for this choice, we show that the assertion level selection has an important impact on solving and that a carefully designed heuristic can significantly improve performance on relevant benchmarks.
On the Geometry of Bayesian Graphical Models with Hidden Variables
Settimi, Raffaella, Smith, Jim Q.
In this paper we investigate the geometry of the likelihood of the unknown parameters in a simple class of Bayesian directed graphs with hidden variables. This enables us, before any numerical algorithms are employed, to obtain certain insights into the nature of the unidentifiability inherent in such models, the way posterior densities will be sensitive to prior densities, and the typical geometrical form these posterior densities might take. Many of these insights carry over into more complicated Bayesian networks with systematically missing data.
Constructing Situation Specific Belief Networks
Mahoney, Suzanne M., Laskey, Kathryn Blackmond
This paper describes a process for constructing situation-specific belief networks from a knowledge base of network fragments. A situation-specific network is a minimal query-complete network constructed from a knowledge base in response to a query for the probability distribution on a set of target variables given evidence and context variables. We present definitions of query completeness and situation-specific networks. We describe conditions on the knowledge base that guarantee query completeness. The relationship of our work to earlier work on KBMC is also discussed.
Hierarchical Solution of Markov Decision Processes using Macro-actions
Hauskrecht, Milos, Meuleau, Nicolas, Kaelbling, Leslie Pack, Dean, Thomas L., Boutilier, Craig
We investigate the use of temporally abstract actions, or macro-actions, in the solution of Markov decision processes. Unlike current models that combine both primitive actions and macro-actions and leave the state space unchanged, we propose a hierarchical model (using an abstract MDP) that works with macro-actions only, and that significantly reduces the size of the state space. This is achieved by treating macro-actions as local policies that act in certain regions of state space, and by restricting states in the abstract MDP to those at the boundaries of regions. The abstract MDP approximates the original and can be solved more efficiently. We discuss several ways in which macro-actions can be generated to ensure good solution quality. Finally, we consider ways in which macro-actions can be reused to solve multiple, related MDPs, and we show that this can justify the computational overhead of macro-action generation.
Dealing with Uncertainty on the Initial State of a Petri Net
Jarkass, Iman, Rombaut, Michele
This paper proposes a method for finding the actual state of a complex dynamic system from information provided by sensors on the system itself or on its environment. The nominal evolution of the system is known a priori and can be modeled (by an expert, for example) using various methods; in this paper, Petri nets are used. Contrary to the usual use of Petri nets, the initial state of the system is unknown, so a degree of belief is assigned to each place or set of places. This uncertainty is modeled with Dempster-Shafer theory, which is well suited to this type of problem. From the given Petri net characterizing the nominal evolution of the dynamic system and from the observation inputs, the proposed method determines the state of the system at any time, according to the reliability of the model and the inputs.
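As a rough illustration of assigning degrees of belief to sets of places, the following sketch applies the standard Dempster rule of combination to two mass functions. The three-place net, the place names, and all numeric masses are hypothetical and are not taken from the paper; the paper's full method, which also propagates beliefs through transitions, is not reproduced here.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of places to a belief mass.
    """
    joint = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass given to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("the two sources are totally conflicting")
    # Renormalize so the remaining masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in joint.items()}


# Hypothetical net with three places; the token starts in p1, p2 or p3.
frame = frozenset({"p1", "p2", "p3"})
# Assumed initial belief: probably p1 or p2, otherwise unknown.
prior = {frozenset({"p1", "p2"}): 0.6, frame: 0.4}
# Assumed sensor report pointing at p2 with reliability 0.7.
sensor = {frozenset({"p2"}): 0.7, frame: 0.3}

posterior = combine(prior, sensor)
for places, mass in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(sorted(places), round(mass, 2))
```

Mass left on the whole frame represents remaining ignorance, which is how a partially reliable sensor can be expressed without forcing a probability onto each individual place.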
Mixture Representations for Inference and Learning in Boltzmann Machines
Lawrence, Neil D., Bishop, Christopher M., Jordan, Michael I.
Boltzmann machines are undirected graphical models with two-state stochastic variables, in which the logarithms of the clique potentials are quadratic functions of the node states. They have been widely studied in the neural computing literature, although their practical applicability has been limited by the difficulty of finding an effective learning algorithm. One well-established approach, known as mean field theory, represents the stochastic distribution using a factorized approximation. However, the corresponding learning algorithm often fails to find a good solution. We conjecture that this is due to the implicit uni-modality of the mean field approximation which is therefore unable to capture multi-modality in the true distribution. In this paper we use variational methods to approximate the stochastic distribution using multi-modal mixtures of factorized distributions. We present results for both inference and learning to demonstrate the effectiveness of this approach.
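The uni-modality issue can be seen in a small sketch of the standard mean field fixed-point iteration. It assumes {0,1} states and symmetric weights; the two-unit machine and its parameters are hypothetical, and the mixture approximation proposed in the paper is not implemented here.

```python
import math


def mean_field(weights, biases, init, sweeps=100):
    """Iterate m_i = sigmoid(b_i + sum_j w_ij * m_j) for {0,1} units.

    `weights` is a symmetric matrix; its diagonal is ignored.
    """
    m = list(init)
    n = len(m)
    for _ in range(sweeps):
        for i in range(n):
            field = biases[i] + sum(
                weights[i][j] * m[j] for j in range(n) if j != i
            )
            m[i] = 1.0 / (1.0 + math.exp(-field))
    return m


# Hypothetical two-unit machine whose true distribution has two equally
# likely modes, (0, 0) and (1, 1).
weights = [[0.0, 8.0], [8.0, 0.0]]
biases = [-4.0, -4.0]

high = mean_field(weights, biases, init=[0.9, 0.9])
low = mean_field(weights, biases, init=[0.1, 0.1])
print(high, low)
```

Depending on the starting point, the factorized approximation settles on one mode or the other, but never represents both at once.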
Large Deviation Methods for Approximate Probabilistic Inference
Kearns, Michael, Saul, Lawrence
We study two-layer belief networks of binary random variables in which the conditional probabilities Pr[child|parents] depend monotonically on weighted sums of the parents. In large networks where exact probabilistic inference is intractable, we show how to compute upper and lower bounds on many probabilities of interest. In particular, using methods from large deviation theory, we derive rigorous bounds on marginal probabilities such as Pr[children] and prove rates of convergence for the accuracy of our bounds as a function of network size. Our results apply to networks with generic transfer function parameterizations of the conditional probability tables, such as sigmoid and noisy-OR. They also explicitly illustrate the types of averaging behavior that can simplify the problem of inference in large networks.
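To make the model class concrete, the sketch below writes the sigmoid and noisy-OR transfer functions as functions of a weighted sum of the parents and computes a child marginal by brute-force enumeration. The function names and all numbers are illustrative assumptions, and the large deviation bounds themselves are not implemented.

```python
import math
from itertools import product


def sigmoid_cpt(weights, parents, bias=0.0):
    """Pr[child=1 | parents] with a sigmoid transfer function."""
    z = bias + sum(w * x for w, x in zip(weights, parents))
    return 1.0 / (1.0 + math.exp(-z))


def noisy_or_cpt(weights, parents):
    """Pr[child=1 | parents] = 1 - exp(-sum_j w_j x_j), with w_j >= 0."""
    z = sum(w * x for w, x in zip(weights, parents))
    return 1.0 - math.exp(-z)


def exact_marginal(cpt, weights, priors):
    """Pr[child=1], summing over all 2^N configurations of N parents."""
    total = 0.0
    for parents in product((0, 1), repeat=len(priors)):
        p_config = 1.0
        for x, p in zip(parents, priors):
            p_config *= p if x else 1.0 - p
        total += p_config * cpt(weights, parents)
    return total


# Hypothetical example: two independent parents. Each noisy-OR weight is
# -log(1 - q_j), where q_j is the chance that parent j alone activates
# the child.
activation = [0.5, 0.2]
weights = [-math.log(1.0 - q) for q in activation]
priors = [0.3, 0.6]
print(exact_marginal(noisy_or_cpt, weights, priors))
print(exact_marginal(sigmoid_cpt, weights, priors))
```

The enumeration visits 2^N parent configurations, which is why exact computation becomes impractical for large networks and bounds become attractive.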
Hierarchical Mixtures-of-Experts for Exponential Family Regression Models with Generalized Linear Mean Functions: A Survey of Approximation and Consistency Results
Jiang, Wenxin, Tanner, Martin A.
We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\psi(\alpha + \mathbf{x}^T \boldsymbol{\beta})$ are mixed. Here $\psi(\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\psi(h(\mathbf{x}))$, where $h(\cdot) \in W_2^\infty$ (a Sobolev class over $[0,1]^s$). It is shown that the HME probability density functions can approximate the true density at a rate of $O(m^{-2/s})$ in $L_p$ norm and at a rate of $O(m^{-4/s})$ in Kullback-Leibler divergence. These rates can be achieved within the family of HME structures with no more than $s$ layers, where $s$ is the dimension of the predictor $\mathbf{x}$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the predicted mean response goes to zero. Conditions for such results to hold are stated and discussed.
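A minimal sketch of how such a mixture produces a predicted mean is given below for a single layer of experts. Softmax gating, the logistic inverse link, and all parameter values are assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def logistic(z):
    """Example inverse link (assumed); any inverse link could be passed."""
    return 1.0 / (1.0 + math.exp(-z))


def mixture_mean(x, gates, experts, inverse_link=logistic):
    """Gate-weighted average of the expert means for predictor x.

    gates:   list of (a_k, v_k) pairs defining the gating scores.
    experts: list of (alpha_k, beta_k) pairs; expert k has mean
             inverse_link(alpha_k + beta_k . x).
    """
    gate_weights = softmax([a + dot(v, x) for a, v in gates])
    means = [inverse_link(alpha + dot(beta, x)) for alpha, beta in experts]
    return sum(g * mu for g, mu in zip(gate_weights, means))


# Hypothetical two-expert model on a two-dimensional predictor.
x = [0.2, 0.7]
gates = [(0.0, [3.0, -1.0]), (0.5, [-2.0, 1.0])]
experts = [(0.1, [1.0, -2.0]), (-0.3, [0.5, 0.5])]
print(mixture_mean(x, gates, experts))
```

Because the gate weights are positive and sum to one, the predicted mean always lies between the smallest and largest expert means at the given predictor value.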
Minimum Encoding Approaches for Predictive Modeling
Grunwald, Peter D, Kontkanen, Petri, Myllymaki, Petri, Silander, Tomi, Tirri, Henry
We analyze differences between two information-theoretically motivated approaches to statistical inference and model selection: the Minimum Description Length (MDL) principle, and the Minimum Message Length (MML) principle. Based on this analysis, we present two revised versions of MML: a pointwise estimator which gives the MML-optimal single parameter model, and a volumewise estimator which gives the MML-optimal region in the parameter space. Our empirical results suggest that with small data sets, the MDL approach yields more accurate predictions than the MML estimators. The empirical results also demonstrate that the revised MML estimators introduced here perform better than the original MML estimator suggested by Wallace and Freeman.
Graphical Models and Exponential Families
Geiger, Dan, Meek, Christopher
We provide a classification of graphical models according to their representation as subfamilies of exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs); directed acyclic graphical models and chain graphs with no hidden variables, including Bayesian networks with several families of local distributions, are curved exponential families (CEFs); and graphical models with hidden variables are stratified exponential families (SEFs). An SEF is a finite union of CEFs satisfying a frontier condition. In addition, we illustrate how one can automatically generate independence and non-independence constraints on the distributions over the observable variables implied by a Bayesian network with hidden variables. The relevance of these results for model selection is examined.