Goto

Collaborating Authors

 Uncertainty


Possibilistic logic bases and possibilistic graphs

arXiv.org Artificial Intelligence

Possibilistic logic bases and possibilistic graphs are two different frameworks of interest for representing knowledge. The former stratifies the pieces of knowledge (expressed by logical formulas) according to their level of certainty, while the latter exhibits relationships between variables. The two types of representations are semantically equivalent when they lead to the same possibility distribution (which rank-orders the possible interpretations). A possibility distribution can be decomposed using a chain rule which may be based on two different kinds of conditioning which exist in possibility theory (one based on product in a numerical setting, one based on minimum operation in a qualitative setting). These two types of conditioning induce two kinds of possibilistic graphs. In both cases, a translation of these graphs into possibilistic bases is provided. The converse translation from a possibilistic knowledge base into a min-based graph is also described.


An Application of Uncertain Reasoning to Requirements Engineering

arXiv.org Artificial Intelligence

This paper examines the use of Bayesian Networks to tackle one of the tougher problems in requirements engineering, translating user requirements into system requirements. The approach taken is to model domain knowledge as Bayesian Network fragments that are glued together to form a complete view of the domain specific system requirements. User requirements are introduced as evidence and the propagation of belief is used to determine what are the appropriate system requirements as indicated by user requirements. This concept has been demonstrated in the development of a system specification and the results are presented here.


A Temporal Bayesian Network for Diagnosis and Prediction

arXiv.org Artificial Intelligence

Diagnosis and prediction in some domains, like medical and industrial diagnosis, require a representation that combines uncertainty management and temporal reasoning. Based on the fact that in many cases there are few state changes in the temporal range of interest, we propose a novel representation called Temporal Nodes Bayesian Networks (TNBN). In a TNBN each node represents an event or state change of a variable, and an arc corresponds to a causal-temporal relationship. The temporal intervals can differ in number and size for each temporal node, so this allows multiple granularity. Our approach is contrasted with a dynamic Bayesian network for a simple medical example. An empirical evaluation is presented for a more complex problem, a subsystem of a fossil power plant, in which this approach is used for fault diagnosis and prediction with good results.


On the Product Rule for Classification Problems

arXiv.org Machine Learning

We discuss theoretical aspects of the product rule for classification problems in supervised machine learning for the case of combining classifiers. We show that (1) the product rule arises from the MAP classifier supposing equivalent priors and conditional independence given a class; (2) under some conditions, the product rule is equivalent to minimizing the sum of the squared distances to the respective centers of the classes related with different features, such distances being weighted by the spread of the classes; (3) observing some hypothesis, the product rule is equivalent to concatenating the vectors of features. With the advance of the Machine Learning field, and the discovery of many different techniques, the subject of combining multiple learners [2] eventually drove attention, in particular the problem of combining classifiers. Many different methods appeared, and soon they were compared in terms of efficiency in solving problems. The product rule has been present in some of these works (e.g., [1, 7, 3, 6, 5, 4, 8]), in contexts ranging from the accuracy of the different combination rules to some analytical properties of the different methods.


Non-parametric Bayesian modelling of digital gene expression data

arXiv.org Machine Learning

Next-generation sequencing technologies provide a revolutionary tool for generating gene expression data. Starting with a fixed RNA sample, they construct a library of millions of differentially abundant short sequence tags or "reads", which constitute a fundamentally discrete measure of the level of gene expression. A common limitation in experiments using these technologies is the low number or even absence of biological replicates, which complicates the statistical analysis of digital gene expression data. Analysis of this type of data has often been based on modified tests originally devised for analysing microarrays; both these and even de novo methods for the analysis of RNA-seq data are plagued by the common problem of low replication. We propose a novel, non-parametric Bayesian approach for the analysis of digital gene expression data. We begin with a hierarchical model for modelling over-dispersed count data and a blocked Gibbs sampling algorithm for inferring the posterior distribution of model parameters conditional on these counts. The algorithm compensates for the problem of low numbers of biological replicates by clustering together genes with tag counts that are likely sampled from a common distribution and using this augmented sample for estimating the parameters of this distribution. The number of clusters is not decided a priori, but it is inferred along with the remaining model parameters. We demonstrate the ability of this approach to model biological data with high fidelity by applying the algorithm on a public dataset obtained from cancerous and non-cancerous neural tissues.


Follow the Leader If You Can, Hedge If You Must

arXiv.org Machine Learning

Follow-the-Leader (FTL) is an intuitive sequential prediction strategy that guarantees constant regret in the stochastic setting, but has terrible performance for worst-case data. Other hedging strategies have better worst-case guarantees but may perform much worse than FTL if the data are not maximally adversarial. We introduce the FlipFlop algorithm, which is the first method that provably combines the best of both worlds. As part of our construction, we develop AdaHedge, which is a new way of dynamically tuning the learning rate in Hedge without using the doubling trick. AdaHedge refines a method by Cesa-Bianchi, Mansour and Stoltz (2007), yielding slightly improved worst-case guarantees. By interleaving AdaHedge and FTL, the FlipFlop algorithm achieves regret within a constant factor of the FTL regret, without sacrificing AdaHedge's worst-case guarantees. AdaHedge and FlipFlop do not need to know the range of the losses in advance; moreover, unlike earlier methods, both have the intuitive property that the issued weights are invariant under rescaling and translation of the losses. The losses are also allowed to be negative, in which case they may be interpreted as gains.


Game Networks

arXiv.org Artificial Intelligence

We introduce Game networks (G nets), a novel representation for multi-agent decision problems. Compared to other game-theoretic representations, such as strategic or extensive forms, G nets are more structured and more compact; more fundamentally, G nets constitute a computationally advantageous framework for strategic inference, as both probability and utility independencies are captured in the structure of the network and can be exploited in order to simplify the inference process. An important aspect of multi-agent reasoning is the identification of some or all of the strategic equilibria in a game; we present original convergence methods for strategic equilibrium which can take advantage of strategic separabilities in the G net structure in order to simplify the computations. Specifically, we describe a method which identifies a unique equilibrium as a function of the game payoffs, and one which identifies all equilibria.


Monte Carlo Inference via Greedy Importance Sampling

arXiv.org Machine Learning

We present a new method for conducting Monte Carlo inference in graphical models which combines explicit search with generalized importance sampling. The idea is to reduce the variance of importance sampling by searching for significant points in the target distribution. We prove that it is possible to introduce search and still maintain unbiasedness. We then demonstrate our procedure on a few simple inference tasks and show that it can improve the inference quality of standard MCMC methods, including Gibbs sampling, Metropolis sampling, and Hybrid Monte Carlo. This paper extends previous work which showed how greedy importance sampling could be correctly realized in the one-dimensional case.


Being Bayesian about Network Structure

arXiv.org Machine Learning

In many domains, we are interested in analyzing the structure of the underlying distribution, e.g., whether one variable is a direct parent of the other. Bayesian model-selection attempts to find the MAP model and use its structure to answer these questions. However, when the amount of available data is modest, there might be many models that have non-negligible posterior. Thus, we want compute the Bayesian posterior of a feature, i.e., the total posterior probability of all models that contain it. In this paper, we propose a new approach for this task. We first show how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed ordering over network variables. This allows us to compute, for a given ordering, both the marginal probability of the data and the posterior of a feature. We then use this result as the basis for an algorithm that approximates the Bayesian posterior of a feature. Our approach uses a Markov Chain Monte Carlo (MCMC) method, but over orderings rather than over network structures. The space of orderings is much smaller and more regular than the space of structures, and has a smoother posterior `landscape'. We present empirical results on synthetic and real-life datasets that compare our approach to full model averaging (when possible), to MCMC over network structures, and to a non-Bayesian bootstrap approach.


Gaussian Process Networks

arXiv.org Machine Learning

In this paper we address the problem of learning the structure of a Bayesian network in domains with continuous variables. This task requires a procedure for comparing different candidate structures. In the Bayesian framework, this is done by evaluating the {em marginal likelihood/} of the data given a candidate structure. This term can be computed in closed-form for standard parametric families (e.g., Gaussians), and can be approximated, at some computational cost, for some semi-parametric families (e.g., mixtures of Gaussians). We present a new family of continuous variable probabilistic networks that are based on {em Gaussian Process/} priors. These priors are semi-parametric in nature and can learn almost arbitrary noisy functional relations. Using these priors, we can directly compute marginal likelihoods for structure learning. The resulting method can discover a wide range of functional dependencies in multivariate data. We develop the Bayesian score of Gaussian Process Networks and describe how to learn them from data. We present empirical results on artificial data as well as on real-life domains with non-linear dependencies.