Goto

Collaborating Authors

 Bayesian Inference


Heteroscedastic Treed Bayesian Optimisation

arXiv.org Machine Learning

Optimising black-box functions is important in many disciplines, such as tuning machine learning models, robotics, finance and mining exploration. Bayesian optimisation is a state-of-the-art technique for the global optimisation of black-box functions which are expensive to evaluate. At the core of this approach is a Gaussian process prior that captures our belief about the distribution over functions. However, in many cases a single Gaussian process is not flexible enough to capture non-stationarity in the objective function. Consequently, heteroscedasticity negatively affects performance of traditional Bayesian methods. In this paper, we propose a novel prior model with hierarchical parameter learning that tackles the problem of non-stationarity in Bayesian optimisation. Our results demonstrate substantial improvements in a wide range of applications, including automatic machine learning and mining exploration.


Optimality of Poisson processes intensity learning with Gaussian processes

arXiv.org Machine Learning

In this paper we provide theoretical support for the so-called "Sigmoidal Gaussian Cox Process" approach to learning the intensity of an inhomogeneous Poisson process on a $d$-dimensional domain. This method was proposed by Adams, Murray and MacKay (ICML, 2009), who developed a tractable computational approach and showed in simulation and real data experiments that it can work quite satisfactorily. The results presented in the present paper provide theoretical underpinning of the method. In particular, we show how to tune the priors on the hyper parameters of the model in order for the procedure to automatically adapt to the degree of smoothness of the unknown intensity and to achieve optimal convergence rates.


Modeling Ecological Integrity with Bayesian Belief Networks

AAAI Conferences

Although the concept of ecological integrity is referred in many country legislations there is no consensus on how to formalize and implement it. One possible definition is as the capacity of an ecosystem to support and maintain a balanced, integrated, and adaptive community of organisms having a species composition, diversity, and functional organization comparable to that of a natural habitat of the region. Our objective is to model this interpretation of ecological integrity from a set of ecological measures that can be estimated from ecological inventory data.


Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs

AAAI Conferences

We consider an autonomous agent facing a partially observable, stochastic, multiagent environment where the unknown policies of other agents are represented as finite state controllers (FSCs). We show how an agent can (i) learn the FSCs of the other agents, and (ii) exploit these models during interactions. To separate the issues of off-line versus on-line learning we consider here an off-line two-phase approach. During the first phase the agent observes as the other player(s) are interacting with the environment (the observations may be imperfect and the learning agent is not taking part in the interaction.) The collected data is used to learn an ensemble of FSCs that explain the behavior of the other agent(s) using a Bayesian non-parametric (BNP) approach. We verify the quality of the learned models during the second phase by allowing the agent to compute its own optimal policy and interact with the observed agent. The optimal policy for the learning agent is obtained by solving an interactive POMDP in which the states are augmented by the other agent(s)' possible FSCs. The advantage of using the Bayesian nonparametric approach in the first phase is that the complexity (number of nodes) of the learned controllers is not bounded a priori. Our two-phase approach is preliminary and separates the learning using BNP from the complexities of learning on-line while the other agent may be modifying its policy (on-line approach is subject of our future work.) We describe our implementation and results in a multiagent Tiger domain. Our results show that learning improves the agent's performance, which increases with the amount of data collected during the learning phase.


Covariance Matrices and Influence Scores for Mean Field Variational Bayes

arXiv.org Machine Learning

Mean field variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, it is well known that a major failing of MFVB is that it underestimates the uncertainty of model variables (sometimes severely) and provides no information about model variable covariance. We develop a fast, general methodology for exponential families that augments MFVB to deliver accurate uncertainty estimates for model variables -- both for individual variables and coherently across variables. MFVB for exponential families defines a fixed-point equation in the means of the approximating posterior, and our approach yields a covariance estimate by perturbing this fixed point. Inspired by linear response theory, we call our method linear response variational Bayes (LRVB). We also show how LRVB can be used to quickly calculate a measure of the influence of individual data points on parameter point estimates. We demonstrate the accuracy and scalability of our method by learning Gaussian mixture models for both simulated and real data.


Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions

arXiv.org Machine Learning

Mixture modelling involves explaining some observed evidence using a combination of probability distributions. The crux of the problem is the inference of an optimal number of mixture components and their corresponding parameters. This paper discusses unsupervised learning of mixture models using the Bayesian Minimum Message Length (MML) criterion. To demonstrate the effectiveness of search and inference of mixture parameters using the proposed approach, we select two key probability distributions, each handling fundamentally different types of data: the multivariate Gaussian distribution to address mixture modelling of data distributed in Euclidean space, and the multivariate von Mises-Fisher (vMF) distribution to address mixture modelling of directional data distributed on a unit hypersphere. The key contributions of this paper, in addition to the general search and inference methodology, include the derivation of MML expressions for encoding the data using multivariate Gaussian and von Mises-Fisher distributions, and the analytical derivation of the MML estimates of the parameters of the two distributions. Our approach is tested on simulated and real world data sets. For instance, we infer vMF mixtures that concisely explain experimentally determined three-dimensional protein conformations, providing an effective null model description of protein structures that is central to many inference problems in structural bioinformatics. The experimental results demonstrate that the performance of our proposed search and inference method along with the encoding schemes improve on the state of the art mixture modelling techniques.


1-Bit Matrix Completion under Exact Low-Rank Constraint

arXiv.org Machine Learning

We consider the problem of noisy 1-bit matrix completion under an exact rank constraint on the true underlying matrix $M^*$. Instead of observing a subset of the noisy continuous-valued entries of a matrix $M^*$, we observe a subset of noisy 1-bit (or binary) measurements generated according to a probabilistic model. We consider constrained maximum likelihood estimation of $M^*$, under a constraint on the entry-wise infinity-norm of $M^*$ and an exact rank constraint. This is in contrast to previous work which has used convex relaxations for the rank. We provide an upper bound on the matrix estimation error under this model. Compared to the existing results, our bound has faster convergence rate with matrix dimensions when the fraction of revealed 1-bit observations is fixed, independent of the matrix dimensions. We also propose an iterative algorithm for solving our nonconvex optimization with a certificate of global optimality of the limiting point. This algorithm is based on low rank factorization of $M^*$. We validate the method on synthetic and real data with improved performance over existing methods.


Classification approach based on association rules mining for unbalanced data

arXiv.org Machine Learning

This paper deals with the binary classification task when the target class has the lower probability of occurrence. In such situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification tree, discriminant analysis, etc. To overcome this short-coming of these methods which yield classifiers with low sensibility, we tackled the classification problem here through an approach based on the association rules learning. This approach has the advantage of allowing the identification of the patterns that are well correlated with the target class. Association rules learning is a well known method in the area of data-mining. It is used when dealing with large database for unsupervised discovery of local patterns that expresses hidden relationships between input variables. In considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained from which one derives a classifier that performs well.


Generative Deep Deconvolutional Learning

arXiv.org Machine Learning

A generative Bayesian model is developed for deep (multi-layer) convolutional dictionary learning. A novel probabilistic pooling operation is integrated into the deep model, yielding efficient bottom-up and top-down probabilistic learning. After learning the deep convolutional dictionary, testing is implemented via deconvolutional inference. To speed up this inference, a new statistical approach is proposed to project the top-layer dictionary elements to the data level. Following this, only one layer of deconvolution is required during testing. Experimental results demonstrate powerful capabilities of the model to learn multi-layer features from images. Excellent classification results are obtained on both the MNIST and Caltech 101 datasets.


ICR: Iterative Convex Refinement for Sparse Signal Recovery Using Spike and Slab Priors

arXiv.org Machine Learning

In this letter, we address sparse signal recovery using spike and slab priors. In particular, we focus on a Bayesian framework where sparsity is enforced on reconstruction coefficients via probabilistic priors. The optimization resulting from spike and slab prior maximization is known to be a hard non-convex problem, and existing solutions involve simplifying assumptions and/or relaxations. We propose an approach called Iterative Convex Refinement (ICR) that aims to solve the aforementioned optimization problem directly allowing for greater generality in the sparse structure. Essentially, ICR solves a sequence of convex optimization problems such that sequence of solutions converges to a sub-optimal solution of the original hard optimization problem. We propose two versions of our algorithm: a.) an unconstrained version, and b.) with a non-negativity constraint on sparse coefficients, which may be required in some real-world problems. Experimental validation is performed on both synthetic data and for a real-world image recovery problem, which illustrates merits of ICR over state of the art alternatives.