Goto

Collaborating Authors

 Uncertainty


Efficient variational inference in large-scale Bayesian compressed sensing

arXiv.org Machine Learning

We study linear models under heavy-tailed priors from a probabilistic viewpoint. Instead of computing a single sparse most probable (MAP) solution as in standard deterministic approaches, the focus in the Bayesian compressed sensing framework shifts towards capturing the full posterior distribution on the latent variables, which allows quantifying the estimation uncertainty and learning model parameters using maximum likelihood. The exact posterior distribution under the sparse linear model is intractable and we concentrate on variational Bayesian techniques to approximate it. Repeatedly computing Gaussian variances turns out to be a key requisite and constitutes the main computational bottleneck in applying variational techniques in large-scale problems. We leverage on the recently proposed Perturb-and-MAP algorithm for drawing exact samples from Gaussian Markov random fields (GMRF). The main technical contribution of our paper is to show that estimating Gaussian variances using a relatively small number of such efficiently drawn random samples is much more effective than alternative general-purpose variance estimation techniques. By reducing the problem of variance estimation to standard optimization primitives, the resulting variational algorithms are fully scalable and parallelizable, allowing Bayesian computations in extremely large-scale problems with the same memory and time complexity requirements as conventional point estimation techniques. We illustrate these ideas with experiments in image deblurring.


Bayesian nonparametric multivariate convex regression

arXiv.org Machine Learning

X, where f(x) is the gradient of f at x. This is called the convex regression problem. Convex regression can easily be modified to allow concave regression by multiplying all of the values by negative one. Convex regression problems are common in economics, operations research and reinforcement learning. In economics, production functions (Skiba 1978) and consumer preferences (Meyer & Pratt 1968) are often convex, while in operations research and reinforcement learning, value functions for stochastic optimization problems can be convex (Shapiro et al. 2009). If a problem is known to be convex, a convex regression estimate provides advantages over an unrestricted estimate. First, convexity is a powerful regularizer: it places strong conditions on the derivatives--and hence smoothness--of a function. Convexity constraints can substantially reduce overfitting and lead to more accurate predictions. Second, maintaining convexity allows the use of convex optimization solvers when the regression estimate is used in an objective function of an optimization problem. 1 Multivariate convex regression has received relatively little attention in the literature.


A Probabilistic Framework for Learning Kinematic Models of Articulated Objects

Journal of Artificial Intelligence Research

Robots operating in domestic environments generally need to interact with articulated objects, such as doors, cabinets, dishwashers or fridges. In this work, we present a novel, probabilistic framework for modeling articulated objects as kinematic graphs. Vertices in this graph correspond to object parts, while edges between them model their kinematic relationship. In particular, we present a set of parametric and non-parametric edge models and how they can robustly be estimated from noisy pose observations. We furthermore describe how to estimate the kinematic structure and how to use the learned kinematic models for pose prediction and for robotic manipulation tasks. We finally present how the learned models can be generalized to new and previously unseen objects. In various experiments using real robots with different camera systems as well as in simulation, we show that our approach is valid, accurate and efficient. Further, we demonstrate that our approach has a broad set of applications, in particular for the emerging fields of mobile manipulation and service robotics.


Lifted Graphical Models: A Survey

arXiv.org Artificial Intelligence

This article presents a survey of work on lifted graphical models. We review a general form for a lifted graphical model, a par-factor graph, and show how a number of existing statistical relational representations map to this formalism. We discuss inference algorithms, including lifted inference algorithms, that efficiently compute the answers to probabilistic queries. We also review work in learning lifted graphical models from data. It is our belief that the need for statistical relational models (whether it goes by that name or another) will grow in the coming decades, as we are inundated with data which is a mix of structured and unstructured, with entities and relations extracted in a noisy manner from text, and with the need to reason effectively with this data. We hope that this synthesis of ideas from many different research groups will provide an accessible starting point for new researchers in this expanding field.


Strong Solutions of the Fuzzy Linear Systems

arXiv.org Artificial Intelligence

We consider a fuzzy linear system with crisp coefficient matrix and with an arbitrary fuzzy number in parametric form on the right-hand side. It is known that the well-known existence and uniqueness theorem of a strong fuzzy solution is equivalent to the following: The coefficient matrix is the product of a permutation matrix and a diagonal matrix. This means that this theorem can be applicable only for a special form of linear systems, namely, only when the system consists of equations, each of which has exactly one variable. We prove an existence and uniqueness theorem, which can be use on more general systems. The necessary and sufficient conditions of the theorem are dependent on both the coefficient matrix and the right-hand side. This theorem is a generalization of the well-known existence and uniqueness theorem for the strong solution.


A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition

arXiv.org Machine Learning

Multi-step ahead forecasting is still an open challenge in time series forecasting. Several approaches that deal with this complex problem have been proposed in the literature but an extensive comparison on a large number of tasks is still missing. This paper aims to fill this gap by reviewing existing strategies for multi-step ahead forecasting and comparing them in theoretical and practical terms. To attain such an objective, we performed a large scale comparison of these different strategies using a large experimental benchmark (namely the 111 series from the NN5 forecasting competition). In addition, we considered the effects of deseasonalization, input variable selection, and forecast combination on these strategies and on multi-step ahead forecasting at large. The following three findings appear to be consistently supported by the experimental results: Multiple-Output strategies are the best performing approaches, deseasonalization leads to uniformly improved forecast accuracy, and input selection is more effective when performed in conjunction with deseasonalization.


Overlapping Mixtures of Gaussian Processes for the Data Association Problem

arXiv.org Machine Learning

In this work we introduce a mixture of GPs to address the data association problem, i.e. to label a group of observations according to the sources that generated them. Unlike several previously proposed GP mixtures, the novel mixture has the distinct characteristic of using no gating function to determine the association of samples and mixture components. Instead, all the GPs in the mixture are global and samples are clustered following "trajectories" across input space. We use a nonstandard variational Bayesian algorithm to efficiently recover sample labels and learn the hyperparameters. We show how multi-object tracking problems can be disambiguated and also explore the characteristics of the model in traditional regression settings. Keywords: Gaussian Processes, Marginalized Variational Inference, Bayesian Models 1. Introduction The data association problem arises in multi-target tracking scenarios.


Sparse Signal Recovery with Temporally Correlated Source Vectors Using Sparse Bayesian Learning

arXiv.org Machine Learning

We address the sparse signal recovery problem in the context of multiple measurement vectors (MMV) when elements in each nonzero row of the solution matrix are temporally correlated. Existing algorithms do not consider such temporal correlations and thus their performance degrades significantly with the correlations. In this work, we propose a block sparse Bayesian learning framework which models the temporal correlations. In this framework we derive two sparse Bayesian learning (SBL) algorithms, which have superior recovery performance compared to existing algorithms, especially in the presence of high temporal correlations. Furthermore, our algorithms are better at handling highly underdetermined problems and require less row-sparsity on the solution matrix. We also provide analysis of the global and local minima of their cost function, and show that the SBL cost function has the very desirable property that the global minimum is at the sparsest solution to the MMV problem. Extensive experiments also provide some interesting results that motivate future theoretical research on the MMV model.


A sticky HDP-HMM with application to speaker diarization

arXiv.org Machine Learning

We consider the problem of speaker diarization, the problem of segmenting an audio recording of a meeting into temporal segments corresponding to individual speakers. The problem is rendered particularly difficult by the fact that we are not allowed to assume knowledge of the number of people participating in the meeting. To address this problem, we take a Bayesian nonparametric approach to speaker diarization that builds on the hierarchical Dirichlet process hidden Markov model (HDP-HMM) of Teh et al. [J. Amer. Statist. Assoc. 101 (2006) 1566--1581]. Although the basic HDP-HMM tends to over-segment the audio data---creating redundant states and rapidly switching among them---we describe an augmented HDP-HMM that provides effective control over the switching rate. We also show that this augmentation makes it possible to treat emission distributions nonparametrically. To scale the resulting architecture to realistic diarization problems, we develop a sampling algorithm that employs a truncated approximation of the Dirichlet process to jointly resample the full state sequence, greatly improving mixing rates. Working with a benchmark NIST data set, we show that our Bayesian nonparametric architecture yields state-of-the-art speaker diarization results.


A Kernel Approach to Tractable Bayesian Nonparametrics

arXiv.org Machine Learning

Inference in popular nonparametric Bayesian models typically relies on sampling or other approximations. This paper presents a general methodology for constructing novel tractable nonparametric Bayesian methods by applying the kernel trick to inference in a parametric Bayesian model. For example, Gaussian process regression can be derived this way from Bayesian linear regression. Despite the success of the Gaussian process framework, the kernel trick is rarely explicitly considered in the Bayesian literature. In this paper, we aim to fill this gap and demonstrate the potential of applying the kernel trick to tractable Bayesian parametric models in a wider context than just regression. As an example, we present an intuitive Bayesian kernel machine for density estimation that is obtained by applying the kernel trick to a Gaussian generative model in feature space.