AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

andrewgordonwilson/bayesgan

@machinelearnbotNov-13-2017, 15:10:06 GMT

This repository contains the Tensorflow implementation of the Bayesian GAN by Yunus Saatchi and Andrew Gordon Wilson. This paper will be appearing at NIPS 2017. In the Bayesian GAN we propose conditional posteriors for the generator and discriminator weights, and marginalize these posteriors through stochastic gradient Hamiltonian Monte Carlo. Key properties of the Bayesian approach to GANs include (1) accurate predictions on semi-supervised learning problems; (2) minimal intervention for good performance; (3) a probabilistic formulation for inference in response to adversarial feedback; (4) avoidance of mode collapse; and (5) a representation of multiple complementary generative and discriminative models for data, forming a probabilistic ensemble. We illustrate a multimodal posterior over the parameters of the generator.

artificial intelligence, bayesian inference, machine learning, (18 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Add feedback

Model Criticism in Latent Space

Seth, Sohan, Murray, Iain, Williams, Christopher K. I.

arXiv.org Machine LearningNov-13-2017

The extended model(s) can again be subjected to criticism, and the process continues until a satisfactory model is found (O'Hagan, 2003). Model criticism is contrasted with model comparison in that model criticism assesses a single model, while model comparison deals with at least two models to decide which model is a better fit. Model comparison can be applied to compare the original and the extended model after model criticism and extension (O'Hagan, 2003, p. 2). Most work on model criticism makes use of the idea that "if the model fits, then replicated data generated under the model should look similar to observed data" (Gelman et al., 2004, p. 165). In contrast, in this paper we focus on the idea that for latent variable models, we can probabilistically pull back the data into the space of the latent variables, and carry out model criticism in that space.

artificial intelligence, machine learning, model criticism, (16 more...)

arXiv.org Machine Learning

1711.04674

Country: Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference

Huggins, Jonathan H., Adams, Ryan P., Broderick, Tamara

arXiv.org Machine LearningNov-13-2017

Generalized linear models (GLMs) -- such as logistic regression, Poisson regression, and robust regression -- provide interpretable models for diverse data types. Probabilistic approaches, particularly Bayesian ones, allow coherent estimates of uncertainty, incorporation of prior information, and sharing of power across experiments via hierarchical models. In practice, however, the approximate Bayesian methods necessary for inference have either failed to scale to large data sets or failed to provide theoretical guarantees on the quality of inference. We propose a new approach based on constructing polynomial approximate sufficient statistics for GLMs (PASS-GLM). We demonstrate that our method admits a simple algorithm as well as trivial streaming and distributed extensions that do not compound error across computations. We provide theoretical guarantees on the quality of point (MAP) estimates, the approximate posterior, and posterior mean and uncertainty estimates. We validate our approach empirically in the case of logistic regression using a quadratic approximation and show competitive performance with stochastic gradient descent, MCMC, and the Laplace approximation in terms of speed and multiple measures of accuracy -- including on an advertising data set with 40 million data points and 20,000 covariates.

approximation, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1709.09216

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Learning Disentangled Representations with Semi-Supervised Deep Generative Models

Siddharth, N., Paige, Brooks, van de Meent, Jan-Willem, Desmaison, Alban, Goodman, Noah D., Kohli, Pushmeet, Wood, Frank, Torr, Philip H. S.

arXiv.org Machine LearningNov-13-2017

Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Typically these models encode all features of the data into a single variable. Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder. This allows us to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables. We further define a general objective for semi-supervised learning in this model class, which can be approximated using an importance sampling procedure. We evaluate our framework's ability to learn disentangled representations, both by qualitative exploration of its generative capacity, and quantitative evaluation of its discriminative ability on a variety of models and datasets.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Machine Learning

1706.004

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)

Add feedback

Deep Learning: A Bayesian Perspective

Polson, Nicholas, Sokolov, Vadim

arXiv.org Machine LearningNov-13-2017

Deep learning is a form of machine learning for nonlinear high dimensional pattern matching and prediction. By taking a Bayesian probabilistic perspective, we provide a number of insights into more efficient algorithms for optimisation and hyper-parameter tuning. Traditional high-dimensional data reduction techniques, such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR), projection pursuit regression (PPR) are all shown to be shallow learners. Their deep learning counterparts exploit multiple deep layers of data reduction which provide predictive performance gains. Stochastic gradient descent (SGD) training optimisation and Dropout (DO) regularization provide estimation and variable selection. Bayesian regularization is central to finding weights and connections in networks to optimize the predictive bias-variance trade-off. To illustrate our methodology, we provide an analysis of international bookings on Airbnb. Finally, we conclude with directions for future research.

artificial intelligence, machine learning, neural network, (14 more...)

arXiv.org Machine Learning

doi: 10.1214/17-BA1082

1706.00473

Country:

North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Consumer Products & Services (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Alpha-Divergences in Variational Dropout

Mazoure, Bogdan, Islam, Riashat

arXiv.org Machine LearningNov-12-2017

We investigate the use of alternative divergences to Kullback-Leibler (KL) in variational inference(VI), based on the Variational Dropout \cite{kingma2015}. Stochastic gradient variational Bayes (SGVB) \cite{aevb} is a general framework for estimating the evidence lower bound (ELBO) in Variational Bayes. In this work, we extend the SGVB estimator with using Alpha-Divergences, which are alternative to divergences to VI' KL objective. The Gaussian dropout can be seen as a local reparametrization trick of the SGVB objective. We extend the Variational Dropout to use alpha divergences for variational inference. Our results compare $\alpha$-divergence variational dropout with standard variational dropout with correlated and uncorrelated weight noise. We show that the $\alpha$-divergence with $\alpha \rightarrow 1$ (or KL divergence) is still a good measure for use in variational inference, in spite of the efficient use of Alpha-divergences for Dropout VI \cite{Li17}. $\alpha \rightarrow 1$ can yield the lowest training error, and optimizes a good lower bound for the evidence lower bound (ELBO) among all values of the parameter $\alpha \in [0,\infty)$.

artificial intelligence, divergence, machine learning, (15 more...)

arXiv.org Machine Learning

1711.04345

Country:

Oceania > Australia (0.14)
North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Add feedback

Filtering Variational Objectives

Maddison, Chris J., Lawson, Dieterich, Tucker, George, Heess, Nicolas, Norouzi, Mohammad, Mnih, Andriy, Doucet, Arnaud, Teh, Yee Whye

arXiv.org Machine LearningNov-12-2017

When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results. Inspired by this, we consider the extension of the ELBO to a family of lower bounds defined by a particle filter's estimator of the marginal likelihood, the filtering variational objectives (FIVOs). FIVOs take the same arguments as the ELBO, but can exploit a model's sequential structure to form tighter bounds. We present results that relate the tightness of FIVO's bound to the variance of the particle filter's estimator by considering the generic case of bounds defined as log-transformed likelihood estimators. Experimentally, we show that training with FIVO results in substantial improvements over training the same model architecture with the ELBO on sequential data.

artificial intelligence, fivo, machine learning, (17 more...)

arXiv.org Machine Learning

1705.09279

Country: Europe > United Kingdom > England (0.28)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Industry:

Media > Music (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Add feedback

Bayesian Belief Updating of Spatiotemporal Seizure Dynamics

Cooray, Gerald K, Rosch, Richard, Baldeweg, Torsten, Lemieux, Louis, Friston, Karl, Sengupta, Biswa

arXiv.org Machine LearningNov-12-2017

Epileptic seizure activity shows complicated dynamics in both space and time. To understand the evolution and propagation of seizures spatially extended sets of data need to be analysed. We have previously described an efficient filtering scheme using variational Laplace that can be used in the Dynamic Causal Modelling (DCM) framework [Friston, 2003] to estimate the temporal dynamics of seizures recorded using either invasive or non-invasive electrical recordings (EEG/ECoG). Spatiotemporal dynamics are modelled using a partial differential equation -- in contrast to the ordinary differential equation used in our previous work on temporal estimation of seizure dynamics [Cooray, 2016]. We provide the requisite theoretical background for the method and test the ensuing scheme on simulated seizure activity data and empirical invasive ECoG data. The method provides a framework to assimilate the spatial and temporal dynamics of seizure activity, an aspect of great physiological and clinical importance.

artificial intelligence, machine learning, seizure activity, (15 more...)

arXiv.org Machine Learning

1705.07278

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Add feedback

Streaming Sparse Gaussian Process Approximations

Bui, Thang D., Nguyen, Cuong V., Turner, Richard E.

arXiv.org Machine LearningNov-12-2017

Sparse pseudo-point approximations for Gaussian process (GP) models provide a suite of methods that support deployment of GPs in the large data regime and enable analytic intractabilities to be sidestepped. However, the field lacks a principled method to handle streaming data in which both the posterior distribution over function values and the hyperparameter estimates are updated in an online fashion. The small number of existing approaches either use suboptimal hand-crafted heuristics for hyperparameter learning, or suffer from catastrophic forgetting or slow updating when new data arrive. This paper develops a new principled framework for deploying Gaussian process probabilistic models in the streaming setting, providing methods for learning hyperparameters and optimising pseudo-input locations. The proposed framework is assessed using synthetic and real-world datasets.

artificial intelligence, machine learning, posterior, (17 more...)

arXiv.org Machine Learning

1705.07131

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Modeling & Simulation (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Uncertainty Decomposition in Bayesian Neural Networks with Latent Variables

Depeweg, Stefan, Hernández-Lobato, José Miguel, Doshi-Velez, Finale, Udluft, Steffen

arXiv.org Machine LearningNov-11-2017

Bayesian neural networks (BNNs) with latent variables are probabilistic models which can automatically identify complex stochastic patterns in the data. We describe and study in these models a decomposition of predictive uncertainty into its epistemic and aleatoric components. First, we show how such a decomposition arises naturally in a Bayesian active learning scenario by following an information theoretic approach. Second, we use a similar decomposition to develop a novel risk sensitive objective for safe reinforcement learning (RL). This objective minimizes the effect of model bias in environments whose stochastic dynamics are described by BNNs with latent variables. Our experiments illustrate the usefulness of the resulting decomposition in active learning and safe RL settings.

artificial intelligence, latent variable, machine learning, (14 more...)

arXiv.org Machine Learning

1706.08495

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback