Uncertainty
The Variational Gaussian Process
Tran, Dustin, Ranganath, Rajesh, Blei, David M.
Variational inference is a powerful tool for approximate inference, and it has been recently applied for representation learning with deep generative models. We develop the variational Gaussian process (VGP), a Bayesian nonparametric variational family, which adapts its shape to match complex posterior distributions. The VGP generates approximate posterior samples by generating latent inputs and warping them through random non-linear mappings; the distribution over random mappings is learned during inference, enabling the transformed outputs to adapt to varying complexity. We prove a universal approximation theorem for the VGP, demonstrating its representative power for learning any model. For inference we present a variational objective inspired by auto-encoders and perform black box inference over a wide class of models. The VGP achieves new state-of-the-art results for unsupervised learning, inferring models such as the deep latent Gaussian model and the recently proposed DRAW.
A short note on extension theorems and their connection to universal consistency in machine learning
Christmann, Andreas, Dumpert, Florian, Xiang, Dao-Hong
Statistical machine learning plays an important role in modern statistics and computer science. One main goal of statistical machine learning is to provide universally consistent algorithms, i.e., the estimator converges in probability or in some stronger sense to the Bayes risk or to the Bayes decision function. Kernel methods based on minimizing the regularized risk over a reproducing kernel Hilbert space (RKHS) belong to these statistical machine learning methods. It is in general unknown which kernel yields optimal results for a particular data set or for the unknown probability measure. Hence various kernel learning methods were proposed to choose the kernel and therefore also its RKHS in a data adaptive manner. Nevertheless, many practitioners often use the classical Gaussian RBF kernel or certain Sobolev kernels with good success. The goal of this short note is to offer one possible theoretical explanation for this empirical fact.
Computationally Efficient Bayesian Learning of Gaussian Process State Space Models
Svensson, Andreas, Solin, Arno, Särkkä, Simo, Schön, Thomas B.
Gaussian processes allow for flexible specification of prior assumptions of unknown dynamics in state space models. We present a procedure for efficient Bayesian learning in Gaussian process state space models, where the representation is formed by projecting the problem onto a set of approximate eigenfunctions derived from the prior covariance structure. Learning under this family of models can be conducted using a carefully crafted particle MCMC algorithm. This scheme is computationally efficient and yet allows for a fully Bayesian treatment of the problem. Compared to conventional system identification tools or existing learning methods, we show competitive performance and reliable quantification of uncertainties in the model.
in memory of David MacKay FRS, a remarkable post of how a pioneer in Machine Learning dealt with his illness • /r/MachineLearning
I was a physicist with an almost nil education in Bayesian inference and machine learning, and his book was the first contact I had with those things. It was the first book on this topic that I read cover-to-cover when I decided to do a PhD in Bayesian models instead of Semiconductor Physics (my M.Sc. It was so recently that I congratulated him on Twitter on him being knighted by the Queen, and I was so happy that he responded. I never knew him, only through his book.
1-bit Matrix Completion: PAC-Bayesian Analysis of a Variational Approximation
Cottet, Vincent, Alquier, Pierre
Due to challenging applications such as collaborative filtering, the matrix completion problem has been widely studied in the past few years. Different approaches rely on different structure assumptions on the matrix at hand. Here, we focus on the completion of a (possibly) low-rank matrix with binary entries, the so-called 1-bit matrix completion problem. Our approach relies on tools from machine learning theory: empirical risk minimization and its convex relaxations. We propose an algorithm to compute a variational approximation of the pseudo-posterior. Thanks to the convex relaxation, the corresponding minimization problem is bi-convex, and thus the method behaves well in practice. We also study the performance of this variational approximation through PAC-Bayesian learning bounds. In contrast to previous works that focused on upper bounds on the estimation error of M with various matrix norms, we are able to derive from this analysis a PAC bound on the prediction error of our algorithm. We focus essentially on convex relaxation through the hinge loss, for which we present the complete analysis, a complete simulation study and a test on the MovieLens data set. However, we also discuss a variational approximation to deal with the logistic loss.
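As a rough illustration of the hinge-loss relaxation mentioned in this abstract, the sketch below evaluates the empirical hinge risk of a low-rank factorization M = U V^T on a few observed binary entries. It is only a minimal sketch under stated assumptions, not the authors' variational algorithm: the function names, the factor matrices and the toy observations are all hypothetical. With V held fixed the risk is convex in U (and vice versa), which is the sense in which such a problem is bi-convex.

```python
def predict_entry(U, V, i, j):
    """Entry (i, j) of the low-rank matrix M = U V^T (rows of U and V are factors)."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def hinge_risk(U, V, observations):
    """Average hinge loss max(0, 1 - y * M_ij) over observed (i, j, y) triples,
    where each label y is -1 or +1."""
    total = 0.0
    for i, j, y in observations:
        total += max(0.0, 1.0 - y * predict_entry(U, V, i, j))
    return total / len(observations)


# Hypothetical toy data: rank-1 factors and three observed binary entries.
U = [[1.0], [-2.0]]
V = [[1.0], [0.5]]
observations = [(0, 0, 1), (0, 1, -1), (1, 0, -1)]

risk = hinge_risk(U, V, observations)
print(risk)
```

Only the second observation is on the wrong side of the margin here, so it alone contributes to the average.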
Quantifying uncertainties on excursion sets under a Gaussian random field prior
Azzimonti, Dario, Bect, Julien, Chevalier, Clément, Ginsbourger, David
We focus on the problem of estimating and quantifying uncertainties on the excursion set of a function under a limited evaluation budget. We adopt a Bayesian approach where the objective function is assumed to be a realization of a Gaussian random field. In this setting, the posterior distribution on the objective function gives rise to a posterior distribution on excursion sets. Several approaches exist to summarize the distribution of such sets based on random closed set theory. While the recently proposed Vorob'ev approach exploits analytical formulae, further notions of variability require Monte Carlo estimators relying on Gaussian random field conditional simulations. In the present work we propose a method to choose Monte Carlo simulation points and obtain quasi-realizations of the conditional field at fine designs through affine predictors. The points are chosen optimally in the sense that they minimize the posterior expected distance in measure between the excursion set and its reconstruction. The proposed method reduces the computational costs due to Monte Carlo simulations and enables the computation of quasi-realizations on fine designs in large dimensions. We apply this reconstruction approach to obtain realizations of an excursion set on a fine grid which allow us to give a new measure of uncertainty based on the distance transform of the excursion set. Finally we present a safety engineering test case where the simulation method is employed to compute a Monte Carlo estimate of a contour line.
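To make the idea of a posterior distribution on excursion sets more concrete, the sketch below computes the pointwise probability that a point belongs to the excursion set {x : f(x) >= t} when the posterior at each point is Gaussian. It assumes the posterior mean and standard deviation are already available on a small grid; the grid values, the threshold and the function names are hypothetical, and this is not the simulation-point selection method proposed in the abstract.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coverage_probability(mean, sd, threshold):
    """P(f(x) >= threshold) when f(x) ~ N(mean, sd^2)."""
    if sd == 0.0:
        # Degenerate case: the value at x is known exactly.
        return 1.0 if mean >= threshold else 0.0
    return normal_cdf((mean - threshold) / sd)


# Hypothetical posterior summaries on four grid points.
post_mean = [0.0, 1.0, 2.0, -1.0]
post_sd = [1.0, 1.0, 0.0, 0.5]
threshold = 1.0

coverage = [coverage_probability(m, s, threshold)
            for m, s in zip(post_mean, post_sd)]

# Averaging the coverage over a uniform grid estimates the expected
# fraction of the domain occupied by the excursion set.
expected_fraction = sum(coverage) / len(coverage)
print(coverage, expected_fraction)
```

Summaries that depend on the joint behaviour of the field across points, rather than on these pointwise probabilities, are the ones that call for conditional simulations.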
Bayesian inference in hierarchical models by combining independent posteriors
Dutta, Ritabrata, Blomstedt, Paul, Kaski, Samuel
Hierarchical models are versatile tools for joint modeling of data sets arising from different, but related, sources. Fully Bayesian inference may, however, become computationally prohibitive if the source-specific data models are complex, or if the number of sources is very large. To facilitate computation, we propose an approach, where inference is first made independently for the parameters of each data set, whereupon the obtained posterior samples are used as observed data in a substitute hierarchical model, based on a scaled likelihood function. Compared to direct inference in a full hierarchical model, the approach has the advantage of being able to speed up convergence by breaking down the initial large inference problem into smaller individual subproblems with better convergence properties. Moreover it enables parallel processing of the possibly complex inferences of the source-specific parameters, which may otherwise create a computational bottleneck if processed jointly as part of a hierarchical model.
TRM: Computing Reputation Score by Mining Reviews
Xu, Guangquan (Tianjin University) | Cao, Yan (Tianjin University) | Zhang, Yao (Tianjin University) | Zhang, Gaoxu (Tianjin University) | Li, Xiaohong (Tianjin University) | Feng, Zhiyong (Tianjin University)
With the rapid development of e-commerce, reputation models have been proposed to help customers make effective purchase decisions. However, most reputation models focus only on the overall ratings of products without considering the reviews provided by customers. We believe that textual reviews provided by buyers can express their real opinions more honestly. Therefore, in this paper, based on the word2vector model, we propose a Textual Reputation Model (TRM) to obtain useful information from reviews and evaluate the trustworthiness of the target product. Experimental results on real data demonstrate the effectiveness of our approach in capturing reputation information from reviews.
Active Inference and Dynamic Gaussian Bayesian Networks for Battery Optimization in Wireless Sensor Networks
Komurlu, Caner (Illinois Institute of Technology) | Bilgic, Mustafa (Illinois Institute of Technology)
Wireless sensor networks play a major role in smart grids and smart buildings. They are not just used for sensing, but they are also used for actuation. In terms of sensing, they are used to measure temperature, humidity, light, to detect motion, etc. Sensors are often operated on a battery and hence we often face a trade-off between obtaining frequent sensor readings versus maximizing their battery life. There have been several approaches to maximizing their battery life from hardware level to software level such as reducing components' energy consumption, limiting node operation capabilities, using power-aware routing protocols, and adding solar energy support. In this paper, we introduce a novel approach: we model the sensor readings in a wireless network using a dynamic Gaussian Bayesian network (dGBn) whose structure is automatically learned from data. dGBn allows us to integrate information across sensors and infer missing readings more accurately. Through active inference for dGBns, we are able to actively choose which sensors should be pulled for a reading and which ones can stay in a power-saving mode at each time step, maximizing prediction accuracy while staying within the budgetary constraints on battery consumption.
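The idea of inferring missing readings from correlated sensors and choosing which sensor to read can be illustrated with plain Gaussian conditioning. The sketch below is a simplified stand-in, not the dGBn or the active inference procedure of the paper: it assumes a static joint Gaussian over three sensors, a budget of one reading per step, and a selection rule that minimizes the summed posterior variance of the unread sensors. The means, the covariance matrix and the function names are hypothetical.

```python
def conditional_mean_var(mu, cov, i, j, x_j):
    """Mean and variance of sensor i after observing x_j at sensor j,
    for jointly Gaussian readings with mean mu and covariance cov."""
    gain = cov[i][j] / cov[j][j]
    mean = mu[i] + gain * (x_j - mu[j])
    var = cov[i][i] - gain * cov[i][j]
    return mean, var


def best_sensor_to_read(cov):
    """Index of the single sensor whose reading leaves the smallest total
    posterior variance over all the other sensors."""
    n = len(cov)

    def remaining_variance(j):
        return sum(cov[i][i] - cov[i][j] ** 2 / cov[j][j]
                   for i in range(n) if i != j)

    return min(range(n), key=remaining_variance)


# Hypothetical prior over three temperature sensors (degrees Celsius).
mu = [20.0, 21.0, 19.0]
cov = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.2],
       [0.1, 0.2, 1.0]]

# Sensor 0 sleeps; infer its reading from an observed 22.0 at sensor 1.
mean, var = conditional_mean_var(mu, cov, 0, 1, 22.0)
chosen = best_sensor_to_read(cov)
print(mean, var, chosen)
```

In a Gaussian model the posterior variance does not depend on the observed value, so under these assumptions the choice of which sensor to read can be made before any reading is taken.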