Uncertainty
Fuzzy logic and online translations
In the IT world it is very important to get things exactly right, unless of course you are dealing with fuzzy logic. If your banking system, for example, is out by a decimal place, this can lead to some very unhappy customers or some very happy ones and a less than happy bank. There are many areas where calculations and units of measure are vital, just ask one of the Mars exploration landing teams. Not so obvious are making sure that things like translation services are accurate, as Microsoft found out recently. The Bing search engine saw some very red-faced Redmondites when Saudi Arabian users found that "Daesh" has been translated as "Saudi Arabia". For those not familiar with the term, Daesh is one name for Isis.
A Probabilistic Modeling Approach to Hearing Loss Compensation
van de Laar, Thijs, de Vries, Bert
Hearing Aid (HA) algorithms need to be tuned ("fitted") to match the impairment of each specific patient. The lack of a fundamental HA fitting theory is a strong contributing factor to an unsatisfying sound experience for about 20% of hearing aid patients. This paper proposes a probabilistic modeling approach to the design of HA algorithms. The proposed method relies on a generative probabilistic model for the hearing loss problem and provides for automated inference of the corresponding (1) signal processing algorithm, (2) the fitting solution as well as a principled (3) performance evaluation metric. All three tasks are realized as message passing algorithms in a factor graph representation of the generative model, which in principle allows for fast implementation on hearing aid or mobile device hardware. The methods are theoretically worked out and simulated with a custom-built factor graph toolbox for a specific hearing loss model.
A General Method for Robust Bayesian Modeling
Robust Bayesian models are appealing alternatives to standard models, providing protection from data that contains outliers or other departures from the model assumptions. Historically, robust models were mostly developed on a case-by-case basis; examples include robust linear regression, robust mixture models, and bursty topic models. In this paper we develop a general approach to robust Bayesian modeling. We show how to turn an existing Bayesian model into a robust model, and then develop a generic strategy for computing with it. We use our method to study robust variants of several models, including linear regression, Poisson regression, logistic regression, and probabilistic topic models. We discuss the connections between our methods and existing approaches, especially empirical Bayes and James-Stein estimation.
Variational Gaussian Process Auto-Encoder for Ordinal Prediction of Facial Action Units
Eleftheriadis, Stefanos, Rudovic, Ognjen, Deisenroth, Marc P., Pantic, Maja
We address the task of simultaneous feature fusion and modeling of discrete ordinal outputs. We propose a novel Gaussian process(GP) auto-encoder modeling approach. In particular, we introduce GP encoders to project multiple observed features onto a latent space, while GP decoders are responsible for reconstructing the original features. Inference is performed in a novel variational framework, where the recovered latent representations are further constrained by the ordinal output labels. In this way, we seamlessly integrate the ordinal structure in the learned manifold, while attaining robust fusion of the input features. We demonstrate the representation abilities of our model on benchmark datasets from machine learning and affect analysis. We further evaluate the model on the tasks of feature fusion and joint ordinal prediction of facial action units. Our experiments demonstrate the benefits of the proposed approach compared to the state of the art.
A General Framework for Constrained Bayesian Optimization using Information-based Search
Hernรกndez-Lobato, Josรฉ Miguel, Gelbart, Michael A., Adams, Ryan P., Hoffman, Matthew W., Ghahramani, Zoubin
We present an information-theoretic framework for solving global black-box optimization problems that also have black-box constraints. Of particular interest to us is to efficiently solve problems with decoupled constraints, in which subsets of the objective and constraint functions may be evaluated independently. For example, when the objective is evaluated on a CPU and the constraints are evaluated independently on a GPU. These problems require an acquisition function that can be separated into the contributions of the individual function evaluations. We develop one such acquisition function and call it Predictive Entropy Search with Constraints (PESC). PESC is an approximation to the expected information gain criterion and it compares favorably to alternative approaches based on improvement in several synthetic and real-world problems. In addition to this, we consider problems with a mix of functions that are fast and slow to evaluate. These problems require balancing the amount of time spent in the meta-computation of PESC and in the actual evaluation of the target objective. We take a bounded rationality approach and develop partial update for PESC which trades off accuracy against speed. We then propose a method for adaptively switching between the partial and full updates for PESC. This allows us to interpolate between versions of PESC that are efficient in terms of function evaluations and those that are efficient in terms of wall-clock time. Overall, we demonstrate that PESC is an effective algorithm that provides a promising direction towards a unified solution for constrained Bayesian optimization.
Generic Inference in Latent Gaussian Process Models
Bonilla, Edwin V., Krauth, Karl, Dezfouli, Amir
We develop an automated variational method for inference in models with Gaussian process (GP) priors and general likelihoods. The method supports multiple outputs and multiple latent functions and does not require detailed knowledge of the conditional likelihood, only needing its evaluation as a black-box function. Using a mixture of Gaussians as the variational distribution, we show that the evidence lower bound and its gradients can be estimated efficiently using empirical expectations over univariate Gaussian distributions. Furthermore, the method is scalable to large datasets which is achieved by using an augmented prior via the inducing-variable approach underpinning most sparse GP approximations, along with parallel computation and stochastic optimization. We evaluate our method with experiments on small datasets, medium-scale datasets and a large dataset, showing its competitiveness under different likelihood models and sparsity levels. Moreover, we analyze learning in our model under batch and stochastic settings, and study the effect of optimizing the inducing inputs. Finally, in the large-scale experiment, we investigate the problem of predicting airline delays and show that our method is on par with the state-of-the-art hard-coded approach for scalable GP regression.
Least Ambiguous Set-Valued Classifiers with Bounded Error Levels
Sadinle, Mauricio, Lei, Jing, Wasserman, Larry
In most classification tasks there are observations that are ambiguous and therefore difficult to correctly label. Set-valued classification allows the classifiers to output a set of plausible labels rather than a single label, thereby giving a more appropriate and informative treatment to the labeling of ambiguous instances. We introduce a framework for multiclass set-valued classification, where the classifiers guarantee user-defined levels of coverage or confidence (the probability that the true label is contained in the set) while minimizing the ambiguity (the expected size of the output). We first derive oracle classifiers assuming the true distribution to be known. We show that the oracle classifiers are obtained from level sets of the functions that define the conditional probability of each class. Then we develop estimators with good asymptotic and finite sample properties. The proposed classifiers build on and refine many existing single-label classifiers. The optimal classifier can sometimes output the empty set. We provide two solutions to fix this issue that are suitable for various practical needs.
The Bayesian SLOPE
The SLOPE estimates regression coefficients by minimizing a regularized residual sum of squares using a sorted-$\ell_1$-norm penalty. The SLOPE combines testing and estimation in regression problems. It exhibits suitable variable selection and prediction properties, as well as minimax optimality. This paper introduces the Bayesian SLOPE procedure for linear regression. The classical SLOPE estimate is the posterior mode in the normal regression problem with an appropriate prior on the coefficients. The Bayesian SLOPE considers the full Bayesian model and has the advantage of offering credible sets and standard error estimates for the parameters. Moreover, the hierarchical Bayesian framework allows for full Bayesian and empirical Bayes treatment of the penalty coefficients; whereas it is not clear how to choose these coefficients when using the SLOPE on a general design matrix. A direct characterization of the posterior is provided which suggests a Gibbs sampler that does not involve latent variables. An efficient hybrid Gibbs sampler for the Bayesian SLOPE is introduced. Point estimation using the posterior mean is highlighted, which automatically facilitates the Bayesian prediction of future observations. These are demonstrated on real and synthetic data.
Qualitative Spatial Logics for Buffered Geometries
This paper describes a series of new qualitative spatial logics for checking consistency of sameAs and partOf matches between spatial objects from different geospatial datasets, especially from crowd-sourced datasets. Since geometries in crowd-sourced data are usually not very accurate or precise, we buffer geometries by a margin of error or a level of tolerance, and define spatial relations for buffered geometries. The spatial logics formalize the notions of `buffered equal' (intuitively corresponding to `possibly sameAs'), `buffered part of' (`possibly partOf'), `near' (`possibly connected') and `far' (`definitely disconnected'). A sound and complete axiomatisation of each logic is provided with respect to models based on metric spaces. For each of the logics, the satisfiability problem is shown to be NP-complete. Finally, we briefly describe how the logics are used in a system for generating and debugging matches between spatial objects, and report positive experimental evaluation results for the system.
Limits on Support Recovery with Probabilistic Models: An Information-Theoretic Framework
Scarlett, Jonathan, Cevher, Volkan
The support recovery problem consists of determining a sparse subset of a set of variables that is relevant in generating a set of observations, and arises in a diverse range of settings such as compressive sensing, and subset selection in regression, and group testing. In this paper, we take a unified approach to support recovery problems, considering general probabilistic models relating a sparse data vector to an observation vector. We study the information-theoretic limits of both exact and partial support recovery, taking a novel approach motivated by thresholding techniques in channel coding. We provide general achievability and converse bounds characterizing the trade-off between the error probability and number of measurements, and we specialize these to the linear, 1-bit, and group testing models. In several cases, our bounds not only provide matching scaling laws in the necessary and sufficient number of measurements, but also sharp thresholds with matching constant factors. Our approach has several advantages over previous approaches: For the achievability part, we obtain sharp thresholds under broader scalings of the sparsity level and other parameters (e.g., signal-to-noise ratio) compared to several previous works, and for the converse part, we not only provide conditions under which the error probability fails to vanish, but also conditions under which it tends to one.