A unifying representation for a class of dependent random measures

arXiv.org Machine Learning

We present a general construction for dependent random measures based on thinning Poisson processes on an augmented space. The framework is not restricted to dependent versions of a specific nonparametric model, but can be applied to all models that can be represented using completely random measures. Several existing dependent random measures can be seen as specific cases of this framework. Interesting properties of the resulting measures are derived and the efficacy of the framework is demonstrated by constructing a covariate-dependent latent feature model and topic model that obtain superior predictive performance.


Bayesian nonparametric models for ranked data

arXiv.org Machine Learning

We develop a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a gamma process. We derive a posterior characterization and a simple and effective Gibbs sampler for posterior simulation. We develop a time-varying extension of our model, and apply it to the New York Times lists of weekly bestselling books.
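As a point of reference for the abstract above, the classical finite Plackett-Luce likelihood can be sketched as follows. This is only an illustration of the base model with a fixed, finite item set; it is not the gamma-process extension or the Gibbs sampler developed in the paper. The function name and the toy weights are assumptions made for the example.

```python
import math


def plackett_luce_log_likelihood(ranking, weights):
    """Log-probability of a ranking under the finite Plackett-Luce model.

    ranking: list of item ids, most preferred first (may be a top-k list).
    weights: dict mapping every candidate item id to a positive weight.
    """
    remaining = sum(weights.values())  # total weight of items not yet chosen
    log_lik = 0.0
    for item in ranking:
        # Each stage picks an item with probability weight / remaining weight.
        log_lik += math.log(weights[item]) - math.log(remaining)
        remaining -= weights[item]
    return log_lik


# Hypothetical toy weights for three items.
weights = {"a": 2.0, "b": 1.0, "c": 1.0}
prob = math.exp(plackett_luce_log_likelihood(["a", "b", "c"], weights))
```

With these weights the ranking a, b, c has probability (2/4) x (1/2) x (1/1) = 0.25.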


A Logic and Adaptive Approach for Efficient Diagnosis Systems using CBR

arXiv.org Artificial Intelligence

Case-Based Reasoning (CBR) is a problem-solving approach that draws on experience, reusing already solved cases (source cases) to find a solution to a new problem (target case). The retrieval phase consists of identifying source cases that are similar to the target case. This phase may lead to erroneous results if imperfections in the existing knowledge are not taken into account. This work presents a novel solution based on fuzzy logic techniques and adaptation measures that aggregate weighted similarities to improve the retrieval results. To confirm the efficiency of our solution, we applied it to the industrial diagnosis domain. The results obtained are more efficient than those obtained by applying typical measures.
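The retrieval step described above can be illustrated with a plain weighted aggregation of per-attribute similarities. This is a generic sketch, not the fuzzy-logic measure proposed in the paper; the attribute names, weights, value ranges and case labels below are all hypothetical.

```python
def weighted_similarity(target, source, weights, ranges):
    """Weighted mean of local similarities between two numeric cases.

    Local similarity for an attribute is 1 - |difference| / range, clipped at 0.
    """
    total_weight = sum(weights.values())
    score = 0.0
    for attr, weight in weights.items():
        local = 1.0 - abs(target[attr] - source[attr]) / ranges[attr]
        score += weight * max(0.0, local)
    return score / total_weight


def retrieve(target, case_base, weights, ranges):
    """Return the source case most similar to the target case."""
    return max(
        case_base,
        key=lambda case: weighted_similarity(target, case["features"], weights, ranges),
    )


# Hypothetical industrial-diagnosis case base.
case_base = [
    {"label": "bearing wear", "features": {"temperature": 80, "vibration": 7}},
    {"label": "normal", "features": {"temperature": 60, "vibration": 2}},
]
weights = {"temperature": 1.0, "vibration": 2.0}
ranges = {"temperature": 50.0, "vibration": 10.0}
target = {"temperature": 78, "vibration": 6}

best = retrieve(target, case_base, weights, ranges)
```

Here the target case is closer to the "bearing wear" source case on both attributes, so that case is retrieved.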


Sure independence screening in generalized linear models with NP-dimensionality

arXiv.org Machine Learning

Ultrahigh-dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849-911] propose an independent screening framework by ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screening property within the context of the linear model with Gaussian covariates and responses. In this paper, we propose a more general version of independence learning that ranks the maximum marginal likelihood estimates or the maximum marginal likelihood itself in generalized linear models. We show that the proposed methods, with Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849-911] as a very special case, also possess the sure screening property with vanishing false selection rate. The conditions under which independence learning possesses the sure screening property are surprisingly simple. This justifies the applicability of such a simple method in a wide spectrum. We quantify explicitly the extent to which the dimensionality can be reduced by independence screening, which depends on the interactions of the covariance matrix of covariates and true parameters. Simulation studies are used to illustrate the utility of the proposed approaches. In addition, we establish an exponential inequality for the quasi-maximum likelihood estimator which is useful for high-dimensional statistical learning.
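The correlation-ranking idea attributed to Fan and Lv in the abstract above can be sketched in a few lines. This shows only the linear-model special case (rank covariates by absolute marginal correlation with the response and keep the top few); it does not implement the marginal likelihood ranking for generalized linear models that the paper proposes. The function name, the `keep` parameter and the toy data are assumptions.

```python
def marginal_correlation_screen(X, y, keep):
    """Return the indices of the `keep` columns of X with the largest
    absolute marginal Pearson correlation with y.

    X: list of rows (each a list of floats); y: list of floats.
    """
    n = len(y)
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]
    y_norm = sum(v * v for v in y_centered) ** 0.5

    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        col_mean = sum(col) / n
        centered = [v - col_mean for v in col]
        norm = sum(v * v for v in centered) ** 0.5
        if norm == 0 or y_norm == 0:
            corr = 0.0  # a constant column carries no marginal signal
        else:
            corr = sum(a * b for a, b in zip(centered, y_centered)) / (norm * y_norm)
        scores.append(abs(corr))

    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy data: y depends only on column 0; column 2 is constant.
X = [[1, 5, 1], [2, 3, 1], [3, 6, 1], [4, 2, 1]]
y = [2, 4, 6, 8]
selected = marginal_correlation_screen(X, y, keep=1)
```

On this toy data the screen keeps column 0, the only covariate that drives the response.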


On the Prior and Posterior Distributions Used in Graphical Modelling

arXiv.org Machine Learning

Graphical model learning and inference are often performed using Bayesian techniques. In particular, learning is usually performed in two separate steps. First, the graph structure is learned from the data; then the parameters of the model are estimated conditional on that graph structure. While the probability distributions involved in this second step have been studied in depth, the ones used in the first step have not been explored in as much detail. In this paper, we will study the prior and posterior distributions defined over the space of the graph structures for the purpose of learning the structure of a graphical model. In particular, we will provide a characterisation of the behaviour of those distributions as a function of the possible edges of the graph. We will then use the properties resulting from this characterisation to define measures of structural variability for both Bayesian and Markov networks, and we will point out some of their possible applications.


Random Utility Theory for Social Choice

arXiv.org Machine Learning

A special case that has received significant attention is the Plackett-Luce model, for which fast inference methods for maximum likelihood estimators are available. This paper develops conditions on general random utility models that enable fast inference within a Bayesian framework through MC-EM, providing concave loglikelihood functions and bounded sets of global maxima solutions. Results on both real-world and simulated data provide support for the scalability of the approach and capability for model selection among general random utility models including Plackett-Luce.


Probabilistic Combination of Classifier and Cluster Ensembles for Non-transductive Learning

arXiv.org Machine Learning

Unsupervised models can provide supplementary soft constraints to help classify new target data under the assumption that similar objects in the target set are more likely to share the same class label. Such models can also help detect possible differences between training and target distributions, which is useful in applications where concept drift may take place. This paper describes a Bayesian framework that takes as input class labels from existing classifiers (designed based on labeled data from the source domain), as well as cluster labels from a cluster ensemble operating solely on the target data to be classified, and yields a consensus labeling of the target data. This framework is particularly useful when the statistics of the target data drift or change from those of the training data. We also show that the proposed framework is privacy-aware and allows performing distributed learning when data/models have sharing restrictions. Experiments show that our framework can yield superior results to those provided by applying classifier ensembles only.


A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

arXiv.org Artificial Intelligence

We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions. Previous work has focused on representing possible functions explicitly, which leads to a two-step procedure of first, doing inference over the function space and second, finding the extrema of these functions. Here we skip the representation step and directly model the distribution over extrema. To this end, we devise a non-parametric conjugate prior based on a kernel regressor. The resulting posterior distribution directly captures the uncertainty over the maximum of the unknown function. We illustrate the effectiveness of our model by optimizing a noisy, high-dimensional, non-convex objective function.


Dynamic Decision Support System Based on Bayesian Networks Application to fight against the Nosocomial Infections

arXiv.org Artificial Intelligence

Improving the quality of medical care is a major concern for the coming years. The fight against nosocomial infections (NI) in intensive care units (ICU) is a good example. We focus on a set of observations that reflect the dynamic aspect of the decision resulting from the application of a Medical Decision Support System (MDSS). This system has to make dynamic decisions on temporal data. We use a dynamic Bayesian network (DBN) to model this dynamic process. This is temporal reasoning within a real-time environment; we are interested in Dynamic Decision Support Systems in the healthcare domain (MDDSS).


Secured Wireless Communication using Fuzzy Logic based High Speed Public-Key Cryptography (FLHSPKC)

arXiv.org Artificial Intelligence

In this paper, secured wireless communication using fuzzy logic based high speed public-key cryptography (FLHSPKC) is proposed, addressing major issues such as computational safety, power management and restricted memory usage in wireless communication. A Wireless Sensor Network (WSN) has several major constraints, such as an inadequate source of energy, restricted computational capability and limited memory. Conventional Elliptic Curve Cryptography (ECC), a form of public-key cryptography used in wireless communication, provides a level of security equivalent to other existing public-key algorithms while using smaller parameters, but it does not address all of these major limitations of WSNs. Given an elliptic curve point P, an arbitrary integer k and a modulus m, conventional ECC carries out the scalar multiplication kP mod m, which takes about 80% of the key computation time on a WSN. The proposed FLHSPKC scheme provides several novel strategies, including a soft computing based strategy, to speed up scalar multiplication in conventional ECC; this in turn shortens computation time and satisfies the power consumption and memory constraints without hampering the security level. A performance analysis of the different strategies under the FLHSPKC scheme and a comparison with existing conventional ECC methods are presented.
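For context, the scalar multiplication kP that the abstract identifies as the bottleneck is conventionally computed with the double-and-add method. The sketch below shows that textbook baseline on a tiny curve; it is not the FLHSPKC speed-up, and the curve y^2 = x^3 + 2x + 2 over the integers mod 17 with base point (5, 1) is a toy choice made for illustration, far too small for real use.

```python
def point_add(P, Q, a, p):
    """Add two points on y^2 = x^3 + a*x + b over the prime field mod p.

    None represents the point at infinity. The coefficient b is not needed
    by the addition formulas.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) is the point at infinity
    if P == Q:
        # Tangent slope; inverse via Fermat's little theorem (p is prime).
        s = (3 * x1 * x1 + a) * pow((2 * y1) % p, p - 2, p) % p
    else:
        s = (y2 - y1) * pow((x2 - x1) % p, p - 2, p) % p
    x3 = (s * s - x1 - x2) % p
    y3 = (s * (x1 - x3) - y1) % p
    return (x3, y3)


def scalar_mult(k, P, a, p):
    """Compute k*P by double-and-add, scanning the bits of k from the low end."""
    result = None
    addend = P
    while k > 0:
        if k & 1:
            result = point_add(result, addend, a, p)
        addend = point_add(addend, addend, a, p)
        k >>= 1
    return result


# Toy parameters (assumed for the example only).
a, p = 2, 17
G = (5, 1)
double_G = scalar_mult(2, G, a, p)
triple_G = scalar_mult(3, G, a, p)
```

The loop performs one doubling per bit of k and one extra addition per set bit, which is why the cost of this step grows with the key size and dominates on constrained sensor nodes.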