Goto

Collaborating Authors

 Uncertainty


A Minimax Optimal Algorithm for Crowdsourcing

arXiv.org Machine Learning

We consider the problem of accurately estimating the reliability of workers based on noisy labels they provide, which is a fundamental question in crowdsourcing. We propose a novel lower bound on the minimax estimation error which applies to any estimation procedure. We further propose Triangular Estimation (TE), an algorithm for estimating the reliability of workers. TE has low complexity, may be implemented in a streaming setting when labels are provided by workers in real time, and does not rely on an iterative procedure. We further prove that TE is minimax optimal and matches our lower bound. We conclude by assessing the performance of TE and other state-of-the-art algorithms on both synthetic and real-world data sets.


Switched latent force models for reverse-engineering transcriptional regulation in gene expression data

arXiv.org Machine Learning

To survive environmental conditions, cells transcribe their response activities into encoded mRNA sequences in order to produce certain amounts of protein concentrations. The external conditions are mapped into the cell through the activation of special proteins called transcription factors (TFs). Due to the difficult task to measure experimentally TF behaviours, and the challenges to capture their quick-time dynamics, different types of models based on differential equations have been proposed. However, those approaches usually incur in costly procedures, and they present problems to describe sudden changes in TF regulators. In this paper, we present a switched dynamical latent force model for reverse-engineering transcriptional regulation in gene expression data which allows the exact inference over latent TF activities driving some observed gene expressions through a linear differential equation. To deal with discontinuities in the dynamics, we introduce an approach that switches between different TF activities and different dynamical systems. This creates a versatile representation of transcription networks that can capture discrete changes and non-linearities We evaluate our model on both simulated data and real-data (e.g. microaerobic shift in E. coli, yeast respiration), concluding that our framework allows for the fitting of the expression data while being able to infer continuous-time TF profiles.


Hierarchical compositional feature learning

arXiv.org Artificial Intelligence

We introduce the hierarchical compositional network (HCN), a directed generative model able to discover and disentangle, without supervision, the building blocks of a set of binary images. The building blocks are binary features defined hierarchically as a composition of some of the features in the layer immediately below, arranged in a particular manner. At a high level, HCN is similar to a sigmoid belief network with pooling. Inference and learning in HCN are very challenging and existing variational approximations do not work satisfactorily. A main contribution of this work is to show that both can be addressed using max-product message passing (MPMP) with a particular schedule (no EM required). Also, using MPMP as an inference engine for HCN makes new tasks simple: adding supervision information, classifying images, or performing inpainting all correspond to clamping some variables of the model to their known values and running MPMP on the rest. When used for classification, fast inference with HCN has exactly the same functional form as a convolutional neural network (CNN) with linear activations and binary weights. However, HCN's features are qualitatively very different.


Conformal predictive distributions with kernels

arXiv.org Machine Learning

Prediction is a fundamental and difficult scientific problem. We limit the scope of our discussion by imposing, from the outset, two restrictions: we only want to predict one real number y R, and we want our prediction to satisfy a reasonable property of validity (under a natural assumption). It can be argued that the fullest prediction for y is a probability measure on R, which can be represented by its distribution function: see, e.g., [5, 6, 8]. We will refer to it as the predictive distribution. A standard property of validity for predictive distributions is being well-calibrated.


A Bayesian Method for Joint Clustering of Vectorial Data and Network Data

arXiv.org Machine Learning

We present a new model-based integrative method for clustering objects given both vectorial data, which describes the feature of each object, and network data, which indicates the similarity of connected objects. The proposed general model is able to cluster the two types of data simultaneously within one integrative probabilistic model, while traditional methods can only handle one data type or depend on transforming one data type to another. Bayesian inference of the clustering is conducted based on a Markov chain Monte Carlo algorithm. A special case of the general model combining the Gaussian mixture model and the stochastic block model is extensively studied. We used both synthetic data and real data to evaluate this new method and compare it with alternative methods. The results show that our simultaneous clustering method performs much better. This improvement is due to the power of the model-based probabilistic approach for efficiently integrating information.


Markov Properties for Graphical Models with Cycles and Latent Variables

arXiv.org Machine Learning

We investigate probabilistic graphical models that allow for both cycles and latent variables. For this we introduce directed graphs with hyperedges (HEDGes), generalizing and combining both marginalized directed acyclic graphs (mDAGs) that can model latent (dependent) variables, and directed mixed graphs (DMGs) that can model cycles. We define and analyse several different Markov properties that relate the graphical structure of a HEDG with a probability distribution on a corresponding product space over the set of nodes, for example factorization properties, structural equations properties, ordered/local/global Markov properties, and marginal versions of these. The various Markov properties for HEDGes are in general not equivalent to each other when cycles or hyperedges are present, in contrast with the simpler case of directed acyclic graphical (DAG) models (also known as Bayesian networks). We show how the Markov properties for HEDGes - and thus the corresponding graphical Markov models - are logically related to each other.


A Correction Method of a Binary Classifier Applied to Multi-label Pairwise Models

arXiv.org Machine Learning

In this work, we addressed the issue of applying a stochastic classifier and a local, fuzzy confusion matrix under the framework of multi-label classification. We proposed a novel solution to the problem of correcting label pairwise ensembles. The main step of the correction procedure is to compute classifier- specific competence and cross-competence measures, which estimates error pattern of the underlying classifier. We considered two improvements of the method of obtaining confusion matrices. The first one is aimed to deal with imbalanced labels. The other utilizes double labelled instances which are usually removed during the pairwise transformation. The proposed methods were evaluated using 29 benchmark datasets. In order to assess the efficiency of the introduced models, they were compared against 1 state-of-the-art approach and the correction scheme based on the original method of confusion matrix estimation. The comparison was performed using four different multi-label evaluation measures: macro and micro-averaged F1 loss, zero-one loss and Hamming loss. Additionally, we investigated relations between classification quality, which is expressed in terms of different quality criteria, and characteristics of multi-label datasets such as average imbalance ratio or label density. The experimental study reveals that the correction approaches significantly outperforms the reference method only in terms of zero-one loss.


Scalable Exact Parent Sets Identification in Bayesian Networks Learning with Apache Spark

arXiv.org Artificial Intelligence

In Machine Learning, the parent set identification problem is to find a set of random variables that best explain selected variable given the data and some predefined scoring function. This problem is a critical component to structure learning of Bayesian networks and Markov blankets discovery, and thus has many practical applications, ranging from fraud detection to clinical decision support. In this paper, we introduce a new distributed memory approach to the exact parent sets assignment problem. To achieve scalability, we derive theoretical bounds to constraint the search space when MDL scoring function is used, and we reorganize the underlying dynamic programming such that the computational density is increased and fine-grain synchronization is eliminated. We then design efficient realization of our approach in the Apache Spark platform. Through experimental results, we demonstrate that the method maintains strong scalability on a 500-core standalone Spark cluster, and it can be used to efficiently process data sets with 70 variables, far beyond the reach of the currently available solutions.


An Expectation Maximization Framework for Preferential Attachment Models

arXiv.org Machine Learning

In this paper we develop an Expectation Maximization(EM) algorithm to estimate the parameter of a Yule-Simon distribution. The Yule-Simon distribution exhibits the "rich get richer" effect whereby an 80-20 type of rule tends to dominate. These distributions are ubiquitous in industrial settings. The EM algorithm presented provides both frequentist and Bayesian estimates of the $\lambda$ parameter. By placing the estimation method within the EM framework we are able to derive Standard errors of the resulting estimate. Additionally, we prove convergence of the Yule-Simon EM algorithm and study the rate of convergence. An explicit, closed form solution for the rate of convergence of the algorithm is given.


AI - The present in the making

#artificialintelligence

For many people, the concept of Artificial Intelligence (AI) is a thing of the future. It is the technology that has yet to be introduced. But Professor Jon Oberlander disagrees. He was quick to point out that AI is not in the future, it is now in the making. He began by mentioning Alexa, Amazon's star product. It is an artificial intelligent personal assistant, which was made popular by Amazon Echo devices.