 Directed Networks


Degrees of Freedom and Model Selection for kmeans Clustering

arXiv.org Machine Learning

This paper investigates the problem of model selection for kmeans clustering, based on conservative estimates of the model degrees of freedom. An extension of Stein's lemma, which is used in unbiased risk estimation, is used to obtain an expression which allows one to approximate the degrees of freedom. Empirically based estimates of this approximation are obtained. The degrees of freedom estimates are then used within the popular Bayesian Information Criterion to perform model selection. The proposed estimation procedure is validated in a thorough simulation study, and the robustness is assessed through relaxations of the modelling assumptions and on data from real applications. Comparisons with popular existing techniques suggest that this approach performs extremely well when the modelling assumptions


Evidential Deep Learning to Quantify Classification Uncertainty

arXiv.org Machine Learning

Deterministic neural nets have been shown to learn effective predictors on a wide range of machine learning problems. However, as the standard approach is to train the network to minimize a prediction loss, the resultant model remains ignorant of its prediction confidence. Orthogonally to Bayesian neural nets that indirectly infer prediction uncertainty through weight uncertainties, we propose explicit modeling of the same using the theory of subjective logic. By placing a Dirichlet prior on the softmax output, we treat predictions of a neural net as subjective opinions and learn the function that collects the evidence leading to these opinions by a deterministic neural net from data. The resultant predictor for a multi-class classification problem is another Dirichlet distribution whose parameters are set by the continuous output of a neural net. We provide a preliminary analysis of how the peculiarities of our new loss function drive improved uncertainty estimation. We observe that our method achieves unprecedented success on detection of out-of-sample queries and endurance against adversarial perturbations.
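
As a rough illustration of how per-class evidence can parameterize a Dirichlet distribution, the sketch below maps non-negative evidence values to Dirichlet parameters, expected class probabilities, belief masses and a single uncertainty mass. The mapping alpha_k = e_k + 1 and the uncertainty u = K / S follow the usual subjective-logic convention and are assumptions here, since the abstract does not spell them out. The function name `dirichlet_opinion` is hypothetical.

```python
from typing import List, Tuple


def dirichlet_opinion(evidence: List[float]) -> Tuple[List[float], List[float], float]:
    """Map per-class evidence to (expected probabilities, belief masses, uncertainty).

    Assumed convention (not stated in the abstract): alpha_k = e_k + 1.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    k = len(evidence)
    alpha = [e + 1.0 for e in evidence]         # Dirichlet parameters
    strength = sum(alpha)                       # Dirichlet strength S
    expected = [a / strength for a in alpha]    # mean of the Dirichlet
    belief = [e / strength for e in evidence]   # belief mass per class
    uncertainty = k / strength                  # remaining mass, u = K / S
    return expected, belief, uncertainty
```

With zero evidence for every class, the expected probabilities are uniform and the uncertainty mass equals 1. As evidence accumulates for one class, the uncertainty shrinks towards 0.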


Cycle-Consistent Adversarial Learning as Approximate Bayesian Inference

arXiv.org Machine Learning

We formalize the problem of learning interdomain correspondences in the absence of paired data as Bayesian inference in a latent variable model (LVM), where one seeks the underlying hidden representations of entities from one domain as entities from the other domain. First, we introduce implicit latent variable models, where the prior over hidden representations can be specified flexibly as an implicit distribution. Next, we develop a new variational inference (VI) algorithm for this model based on minimization of the symmetric Kullback-Leibler (KL) divergence between a variational joint and the exact joint distribution. Lastly, we demonstrate that the state-of-the-art cycle-consistent adversarial learning (CYCLEGAN) models can be derived as a special case within our proposed VI framework, thus establishing its connection to approximate Bayesian inference methods.
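
For readers unfamiliar with the symmetric KL divergence mentioned above, here is a minimal sketch for two discrete distributions. It is a toy illustration of the quantity KL(p || q) + KL(q || p) only, not the paper's variational algorithm, which works with implicit joint distributions. The function names are hypothetical.

```python
import math
from typing import Sequence


def kl(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for discrete distributions; requires q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p: Sequence[float], q: Sequence[float]) -> float:
    """Symmetric KL divergence: KL(p || q) + KL(q || p)."""
    return kl(p, q) + kl(q, p)
```

Unlike either one-directional KL term, the sum is unchanged when the two arguments are swapped.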


Semiparametric Classification of Forest Graphical Models

arXiv.org Machine Learning

We propose a new semiparametric approach to binary classification that exploits the modeling flexibility of sparse graphical models. Specifically, we assume that each class can be represented by a forest-structured graphical model. Under this assumption, the optimal classifier is linear in the log of the one- and two-dimensional marginal densities. Our proposed procedure non-parametrically estimates the univariate and bivariate marginal densities, maps each sample to the logarithm of these estimated densities and constructs a linear SVM in the transformed space. We prove convergence of the resulting classifier to an oracle SVM classifier and give finite sample bounds on its excess risk. Experiments with simulated and real data indicate that the resulting classifier is competitive with several popular methods across a range of applications.
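
The claim that the optimal classifier is linear in the log marginals can be checked with the standard factorization of a forest-structured density. The notation below is an assumption introduced for illustration: E_c is the edge set of the forest for class c, d_c(i) is the degree of node i in that forest, and pi_c is the class prior.

```latex
% Forest-structured density for class c with edge set E_c
p_c(x) = \prod_{(i,j) \in E_c} \frac{p_c(x_i, x_j)}{p_c(x_i)\, p_c(x_j)}
         \prod_{i=1}^{d} p_c(x_i)

% Taking logs, each node i collects the exponent 1 - d_c(i)
\log p_c(x) = \sum_{(i,j) \in E_c} \log p_c(x_i, x_j)
            + \sum_{i=1}^{d} \bigl(1 - d_c(i)\bigr) \log p_c(x_i)

% The Bayes rule predicts class 1 when
\log p_1(x) - \log p_0(x) > \log \frac{\pi_0}{\pi_1}
```

The left-hand side of the decision rule is a linear combination of log univariate and log bivariate marginals with fixed coefficients, which is why a linear SVM on those features is a natural estimator.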


MRPC: An R package for accurate inference of causal graphs

arXiv.org Machine Learning

We present MRPC, an R package that learns causal graphs with improved accuracy over existing packages, such as pcalg and bnlearn. Our algorithm builds on the powerful PC algorithm, the canonical algorithm in computer science for learning directed acyclic graphs. The improvement in accuracy results from online control of the false discovery rate (FDR) that reduces false positive edges, a more accurate approach to identifying v-structures (i.e., $T_1 \rightarrow T_2 \leftarrow T_3$), and robust estimation of the correlation matrix among nodes. For genomic data that contain genotypes and gene expression for each sample, MRPC incorporates the principle of Mendelian randomization to orient the edges. Our package can be applied to continuous and discrete data.
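
To make the notion of a v-structure concrete, the sketch below applies the standard PC-algorithm rule: for an unshielded triple a - c - b (a and b not adjacent), orient it as a -> c <- b when c is not in the separating set of a and b. This is the textbook rule, not MRPC's refined approach, and the function name and data layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def find_v_structures(
    adjacency: Dict[str, Set[str]],
    sepsets: Dict[FrozenSet[str], Set[str]],
) -> List[Tuple[str, str, str]]:
    """Return triples (a, c, b) that the standard PC rule orients as a -> c <- b.

    adjacency: undirected skeleton, node -> set of neighbours.
    sepsets: separating set recorded for each non-adjacent pair
             (a missing entry is treated as the empty set).
    """
    found = []
    for c in sorted(adjacency):
        for a, b in combinations(sorted(adjacency[c]), 2):
            if b in adjacency[a]:
                continue  # shielded triple: a and b are adjacent
            if c not in sepsets.get(frozenset((a, b)), set()):
                found.append((a, c, b))
    return found
```

For the skeleton T1 - T2 - T3 with an empty separating set for (T1, T3), the function returns the single triple (T1, T2, T3). If T2 is in that separating set, it returns nothing.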


Deep Gaussian Processes with Convolutional Kernels

arXiv.org Machine Learning

Deep Gaussian processes (DGPs) provide a Bayesian non-parametric alternative to standard parametric deep learning models. A DGP is formed by stacking multiple GPs, resulting in a well-regularized composition of functions. The Bayesian framework that equips the model with attractive properties, such as implicit capacity control and predictive uncertainty, makes it at the same time challenging to combine with a convolutional structure. This has hindered the application of DGPs in computer vision tasks, an area where deep parametric models (i.e. CNNs) have made breakthroughs. Standard kernels used in DGPs, such as radial basis functions (RBFs), are insufficient for handling pixel variability in raw images. In this paper, we build on the recent convolutional GP to develop Convolutional DGP (CDGP) models which effectively capture image-level features through the use of convolution kernels, thereby opening up the way for applying DGPs to computer vision tasks. Our model learns local spatial influence and outperforms strong GP-based baselines on multi-class image classification. We also consider various constructions of the convolution kernel over the image patches, analyze the computational trade-offs and provide an efficient framework for convolutional DGP models. The experimental results on image data such as MNIST, rectangles-image, CIFAR10 and Caltech101 demonstrate the effectiveness of the proposed approaches.
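
One simple way to build a convolution kernel over image patches is to apply a base kernel to every pair of patches from two images and combine the results. The sketch below averages an RBF patch kernel over all patch pairs. It is only one possible construction, assumed here for illustration; the abstract does not say which constructions the paper uses, and all function names are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]


def patches(img: Image, size: int) -> List[List[float]]:
    """All size x size patches (stride 1), each flattened to a vector."""
    h, w = len(img), len(img[0])
    return [
        [img[r + i][c + j] for i in range(size) for j in range(size)]
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    ]


def rbf(u: List[float], v: List[float], lengthscale: float = 1.0) -> float:
    """RBF kernel between two flattened patches."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-0.5 * sq / lengthscale ** 2)


def conv_kernel(x: Image, y: Image, size: int = 2, lengthscale: float = 1.0) -> float:
    """Average of the patch-level RBF kernel over all pairs of patches."""
    px, py = patches(x, size), patches(y, size)
    total = sum(rbf(p, q, lengthscale) for p in px for q in py)
    return total / (len(px) * len(py))
```

Because the base kernel only ever sees small patches, two images can be judged similar when they share local structure, even if that structure appears at different locations.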


A Primer on Causal Analysis

arXiv.org Machine Learning

We provide a conceptual map to navigate causal analysis problems. Focusing on the case of discrete random variables, we consider the case of causal effect estimation from observational data. The presented approaches apply also to continuous variables, but the issue of estimation becomes more complex. We then introduce the four schools of thought for causal analysis


Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study

arXiv.org Machine Learning

We prove that idealised discriminative Bayesian neural networks, capturing perfect epistemic uncertainty, cannot have adversarial examples: techniques for crafting adversarial examples will necessarily fail to generate perturbed images which fool the classifier. This suggests why MC dropout-based techniques have been observed to be fairly effective against adversarial examples. We support our claims mathematically and empirically. We experiment with HMC on synthetic data derived from MNIST for which we know the ground truth image density, showing that near-perfect epistemic uncertainty correlates with density under the image manifold, and that adversarial images lie off the manifold. Using our new-found insights, we suggest a new attack for MC dropout-based models by looking for imperfections in uncertainty estimation, and also suggest a mitigation. Lastly, we demonstrate our mitigation on a cats-vs-dogs image classification task with a VGG13 variant.


How Bayesian Networks are pioneering the 'smart data' revolution

#artificialintelligence

The era of 'big data' offers enormous opportunities for societal improvements. There is an expectation, and even excitement, that by simply applying sophisticated machine learning algorithms to 'big data' sets, we may automatically find solutions to problems that were previously either unsolvable or would incur prohibitive economic costs. Yet, the clever algorithms needed to process big data cannot (and will never) solve most of the critical risk analysis problems that we face. Big data, even when carefully collected, is typically unstructured and noisy; even the 'biggest data' typically lack crucial, often hidden, information about key causal or explanatory variables that generate or influence the data we observe. For example, the world's leading economists failed to predict the 2008–2010 international financial crisis because they relied on models based on historical statistical data that could not adapt to new circumstances, even when those circumstances were foreseeable by contrarian experts.


Variable Selection Methods for Model-based Clustering

arXiv.org Machine Learning

Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to deal with the increasing dimensionality. In particular, the development of variable selection techniques has received a lot of attention and research effort in recent years. Even for small size problems, variable selection has been advocated to facilitate the interpretation of the clustering results. This review provides a summary of the methods developed for variable selection in model-based clustering. Existing R packages implementing the different methods are indicated and illustrated in application to two data analysis examples.