AITopics | Bayesian Inference

Bayesian Inference

Bayes' Theorem allows a program to infer the probability of each likely cause from an observed effect, when what it is given are the probabilities of the effects given the causes, together with prior probabilities for the causes.
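As a minimal sketch of that inversion, the snippet below computes P(cause | effect) from P(effect | cause) and a prior over causes. The function name `posterior` and the diagnostic-test numbers are hypothetical, chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Return P(cause | effect) for every cause via Bayes' theorem.

    priors:      dict mapping cause -> P(cause)
    likelihoods: dict mapping cause -> P(effect | cause)
    """
    # Joint probability P(cause, effect) for each candidate cause.
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    # Evidence P(effect), obtained by summing over all causes.
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers: a rare condition and an imperfect test.
priors = {"disease": 0.01, "healthy": 0.99}
likelihoods = {"disease": 0.95, "healthy": 0.05}  # P(positive test | cause)

post = posterior(priors, likelihoods)
print(round(post["disease"], 3))  # 0.161
```

Even with a 95% detection rate, the posterior probability of the rare cause stays low because the prior is small; this is the kind of reasoning the theorem makes explicit.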

$\alpha$-Variational Inference with Statistical Guarantees

Yang, Yun, Pati, Debdeep, Bhattacharya, Anirban

arXiv.org Machine Learning, Feb-7-2018

We propose a family of variational approximations to Bayesian posterior distributions, called $\alpha$-VB, with provable statistical guarantees. The standard variational approximation is a special case of $\alpha$-VB with $\alpha=1$. When $\alpha \in(0,1]$, a novel class of variational inequalities is developed for linking the Bayes risk under the variational approximation to the objective function in the variational optimization problem, implying that maximizing the evidence lower bound in variational inference has the effect of minimizing the Bayes risk within the variational density family. Operating in a frequentist setup, the variational inequalities imply that point estimates constructed from the $\alpha$-VB procedure converge at an optimal rate to the true parameter in a wide range of problems. We illustrate our general theory with a number of examples, including the mean-field variational approximation to (low)-high-dimensional Bayesian linear regression with spike and slab priors, mixture of Gaussian models, latent Dirichlet allocation, and (mixture of) Gaussian variational approximation in regular parametric models.
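One common way to write a tempered variational objective of this kind is sketched below. The notation ($\mathcal{Q}$ for the variational family, $\pi$ for the prior, $X$ for the data) is an assumption made here for illustration, and the paper's exact definition, especially for models with latent variables, may differ.

```latex
\[
  \hat{q}_{\alpha}
  \;=\;
  \operatorname*{arg\,max}_{q \in \mathcal{Q}}
  \Big\{
    \alpha \, \mathbb{E}_{q}\big[\log p(X \mid \theta)\big]
    \;-\;
    \mathrm{KL}\big(q \,\|\, \pi\big)
  \Big\},
  \qquad \alpha \in (0, 1].
\]
% Setting \alpha = 1 recovers the usual evidence lower bound (ELBO):
\[
  \mathrm{ELBO}(q)
  \;=\;
  \mathbb{E}_{q}\big[\log p(X \mid \theta)\big]
  \;-\;
  \mathrm{KL}\big(q \,\|\, \pi\big).
\]
```

Under this reading, $\alpha < 1$ down-weights the likelihood relative to the prior, which corresponds to approximating a fractional posterior proportional to $p(X \mid \theta)^{\alpha}\,\pi(\theta)$.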

approximation, artificial intelligence, machine learning

1710.03266

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Modelling Preference Data with the Wallenius Distribution

Grazian, Clara, Leisen, Fabrizio, Liseo, Brunero

arXiv.org Machine Learning, Feb-7-2018

The Wallenius distribution is a generalisation of the Hypergeometric distribution where weights are assigned to balls of different colours. This naturally defines a model for ranking categories which can be used for classification purposes. Since, in general, the resulting likelihood is not analytically available, we adopt an approximate Bayesian computation (ABC) approach for estimating the importance of the categories. We illustrate the performance of the estimation procedure on simulated datasets. Finally, we use the new model for analysing two datasets about movie ratings and Italian academic statisticians' journal preferences. The latter is a novel dataset collected by the authors.
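The idea of pairing a Wallenius urn with ABC can be sketched as follows: simulate weighted draws without replacement, then keep the candidate weights whose simulated data resemble the observations. This is a plain rejection-ABC illustration and not the authors' procedure; the function names, the uniform prior on the weight, the summary statistic, and the tolerance `eps` are all assumptions.

```python
import random


def draw_wallenius(counts, weights, n, rng):
    """Draw n balls one at a time without replacement.

    At each step a colour is picked with probability proportional to
    weight * (number of balls of that colour still in the urn).
    Returns the number of balls drawn per colour.
    """
    remaining = list(counts)
    drawn = [0] * len(counts)
    for _ in range(n):
        colours = [i for i, r in enumerate(remaining) if r > 0]
        masses = [weights[i] * remaining[i] for i in colours]
        i = rng.choices(colours, weights=masses)[0]
        remaining[i] -= 1
        drawn[i] += 1
    return drawn


def mean_first_colour(samples):
    """Summary statistic: average number of colour-0 balls per sample."""
    return sum(s[0] for s in samples) / len(samples)


def abc_rejection(observed, counts, n, num_sims, eps, rng):
    """Rejection ABC for the weight of colour 0 (colour 1 is fixed at 1.0)."""
    target = mean_first_colour(observed)
    accepted = []
    for _ in range(num_sims):
        w0 = rng.uniform(0.1, 5.0)  # assumed prior on the unknown weight
        sims = [draw_wallenius(counts, [w0, 1.0], n, rng) for _ in observed]
        if abs(mean_first_colour(sims) - target) <= eps:
            accepted.append(w0)  # keep weights that reproduce the data
    return accepted


rng = random.Random(0)
counts, n = [10, 10], 8  # two colours, 10 balls each, 8 draws per sample
# Synthetic "observed" data generated with a true weight of 3.0 for colour 0.
observed = [draw_wallenius(counts, [3.0, 1.0], n, rng) for _ in range(50)]
accepted = abc_rejection(observed, counts, n, num_sims=500, eps=0.2, rng=rng)
if accepted:
    print(len(accepted), sum(accepted) / len(accepted))
```

The accepted weights form an approximate posterior sample; shrinking `eps` tightens the approximation at the cost of fewer acceptances.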

artificial intelligence, machine learning, wallenius distribution

1701.08142

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.69)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.69)
Information Technology > Data Science (0.68)

Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data

McDermott, Patrick L., Wikle, Christopher K.

arXiv.org Machine Learning, Feb-6-2018

Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. More recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications.

artificial intelligence, bast-rnn model, machine learning

1711.00636

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent

Campbell, Trevor, Broderick, Tamara

arXiv.org Machine Learning, Feb-5-2018

Coherent uncertainty quantification is a key strength of Bayesian methods. But modern algorithms for approximate Bayesian posterior inference often sacrifice accurate posterior uncertainty estimation in the pursuit of scalability. This work shows that previous Bayesian coreset construction algorithms (which build a small, weighted subset of the data that approximates the full dataset) are no exception. We demonstrate that these algorithms scale the coreset log-likelihood suboptimally, resulting in underestimated posterior uncertainty. To address this shortcoming, we develop greedy iterative geodesic ascent (GIGA), a novel algorithm for Bayesian coreset construction that scales the coreset log-likelihood optimally. GIGA provides geometric decay in posterior approximation error as a function of coreset size, and maintains the fast running time of its predecessors. The paper concludes with validation of GIGA on both synthetic and real datasets, demonstrating that it reduces posterior approximation error by orders of magnitude compared with previous coreset constructions.

artificial intelligence, bayesian inference, machine learning

1802.01737

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Weakly-supervised Dictionary Learning

You, Zeyu, Raich, Raviv, Fern, Xiaoli Z., Kim, Jinsub

arXiv.org Machine Learning, Feb-5-2018

We present a probabilistic modeling and inference framework for discriminative analysis dictionary learning under a weak supervision setting. Dictionary learning approaches have been widely used for tasks such as low-level signal denoising and restoration as well as high-level classification tasks, which can be applied to audio and image analysis. Synthesis dictionary learning aims at jointly learning a dictionary and corresponding sparse coefficients to provide accurate data representation. This approach is useful for denoising and signal restoration, but may lead to sub-optimal classification performance. By contrast, analysis dictionary learning provides a transform that maps data to a sparse discriminative representation suitable for classification. We consider the problem of analysis dictionary learning for time-series data under a weak supervision setting in which signals are assigned a global label instead of an instantaneous label signal. We propose a discriminative probabilistic model that incorporates both label information and sparsity constraints on the underlying latent instantaneous label signal using cardinality control. We present the expectation maximization (EM) procedure for maximum likelihood estimation (MLE) of the proposed model. To facilitate a computationally efficient E-step, we propose both a chain and a novel tree graph reformulation of the graphical model. The performance of the proposed model is demonstrated on both synthetic and real-world data.

artificial intelligence, data mining, machine learning

1802.01709

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Fast and accurate approximation of the full conditional for gamma shape parameters

Miller, Jeffrey W.

arXiv.org Machine Learning, Feb-5-2018

The gamma distribution arises frequently in Bayesian models, but there is not an easy-to-use conjugate prior for the shape parameter of a gamma. This inconvenience is usually dealt with by using either Metropolis-Hastings moves, rejection sampling methods, or numerical integration. However, in models with a large number of shape parameters, these existing methods are slower or more complicated than one would like, making them burdensome in practice. It turns out that the full conditional distribution of the gamma shape parameter is well approximated by a gamma distribution, even for small sample sizes. This article introduces a quick and easy algorithm for finding a gamma distribution that approximates the full conditional distribution of the shape parameter. We empirically demonstrate the speed and accuracy of the approximation across a wide range of conditions. If exactness is required, the approximation can be used as a proposal distribution for Metropolis-Hastings.
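To see why no convenient conjugate prior exists, it helps to write out the full conditional of the shape parameter. The sketch below assumes i.i.d. observations $x_1, \dots, x_n \sim \mathrm{Gamma}(a, b)$ in the shape-rate parametrization with a generic prior $p(a)$; the article itself may use a different parametrization.

```latex
\[
  p(a \mid x_{1:n}, b)
  \;\propto\;
  p(a)\,
  \frac{b^{\,n a}}{\Gamma(a)^{n}}
  \prod_{i=1}^{n} x_i^{\,a-1},
\]
\[
  \log p(a \mid x_{1:n}, b)
  \;=\;
  \log p(a)
  + n a \log b
  - n \log \Gamma(a)
  + (a - 1) \sum_{i=1}^{n} \log x_i
  + \text{const}.
\]
```

The $\Gamma(a)^{n}$ term in the denominator is what prevents this density from matching a standard family exactly, which is why sampling or approximation schemes are needed.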

approximation, artificial intelligence, machine learning

1802.0161

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Understanding Objective Functions in Neural Networks

@machinelearnbot, Feb-2-2018, 23:56:14 GMT

The main inspiration for this blog post is the work I did on Bayesian Neural Networks with my friend Brian Trippe at the Computational and Biological Learning Lab at Cambridge University. I highly recommend reading Brian's thesis on variational inference in neural networks. Disclaimer: At the Computational and Biological Learning Lab, Bayesian machine learning techniques are unapologetically taught as the way forward. As such, be aware of potential bias in this blog post. For example, in image classification, x represents an image and y the corresponding image label.

artificial intelligence, bayesian inference, machine learning

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.25)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Bayesian Renewables Scenario Generation via Deep Generative Networks

Chen, Yize, Li, Pan, Zhang, Baosen

arXiv.org Machine Learning, Feb-2-2018

We present a method to generate renewable scenarios using Bayesian probabilities by implementing the Bayesian generative adversarial network (Bayesian GAN), which is a variant of generative adversarial networks based on two interconnected deep neural networks. By using a Bayesian formulation, generators can be constructed and trained to produce scenarios that capture different salient modes in the data, allowing for better diversity and more accurate representation of the underlying physical process. Compared to conventional statistical models that are often hard to scale or sample from, this method is model-free and can generate samples extremely efficiently. For validation, we use wind and solar time-series data from NREL integration data sets to train the Bayesian GAN. We demonstrate that the proposed method is able to generate clusters of wind scenarios with different variance and mean value, and is able to distinguish and generate wind and solar scenarios simultaneously even if the historical data are intentionally mixed.

artificial intelligence, bayesian inference, machine learning

1802.00868

Country: North America > United States (0.69)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

VIBNN: Hardware Acceleration of Bayesian Neural Networks

Cai, Ruizhe, Ren, Ao, Liu, Ning, Ding, Caiwen, Wang, Luhao, Qian, Xuehai, Pedram, Massoud, Wang, Yanzhi

arXiv.org Machine Learning, Feb-2-2018

Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference. By introducing weights associated with conditioned probability distributions, BNNs are capable of resolving the overfitting issue commonly seen in conventional neural networks and allow for small-data training, through the variational inference process. Frequent usage of Gaussian random variables in this process requires a properly optimized Gaussian Random Number Generator (GRNG). The high hardware cost of conventional GRNG makes the hardware implementation of BNNs challenging. In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs. We explore the design space for the massive number of Gaussian variable sampling tasks in BNNs. Specifically, we introduce two high-performance Gaussian (pseudo) random number generators: 1) the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), which is inspired by the properties of the binomial distribution and linear feedback logics; and 2) the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator. To achieve high scalability and efficient memory access, we propose a deep pipelined accelerator architecture with fast execution and good hardware utilization. Experimental results demonstrate that the proposed VIBNN implementations on an FPGA can achieve throughput of 321,543.4

artificial intelligence, bayesian inference, machine learning

doi: 10.1145/3173162.3173212

1802.00822

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

An Instability in Variational Inference for Topic Models

Ghorbani, Behrooz, Javadi, Hamid, Montanari, Andrea

arXiv.org Machine Learning, Feb-2-2018

Topic models are Bayesian models that are frequently used to capture the latent structure of certain corpora of documents or images. Each data element in such a corpus (for instance each item in a collection of scientific articles) is regarded as a convex combination of a small number of vectors corresponding to 'topics' or 'components'. The weights are assumed to have a Dirichlet prior distribution. The standard approach towards approximating the posterior is to use variational inference algorithms, and in particular a mean field approximation. We show that this approach suffers from an instability that can produce misleading conclusions. Namely, for certain regimes of the model parameters, variational inference outputs a non-trivial decomposition into topics. However, for the same parameter values, the data contain no actual information about the true decomposition, and hence the output of the algorithm is uncorrelated with the true topic decomposition. Among other consequences, the estimated posterior mean is significantly wrong, and estimated Bayesian credible regions do not achieve the nominal coverage. We discuss how this instability is remedied by more accurate mean field approximations.

free energy, machine learning, natural language

1802.00568

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.85)
