AITopics

1906.05082

Country:

North America > United States > California (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Haussmann, Manuel, Hamprecht, Fred A., Kandemir, Melih

Sampling-Free Variational Inference of Bayesian Neural Networks by Variance Backpropagation

arXiv.org Machine LearningJun-12-2019

We propose a new Bayesian Neural Net formulation that affords variational inference for which the evidence lower bound is analytically tractable subject to a tight approximation. We achieve this tractability by (i) decomposing ReLU nonlinearities into the product of an identity and a Heaviside step function, (ii) introducing a separate path that decomposes the neural net expectation from its variance. We demonstrate formally that introducing separate latent binary variables to the activations allows representing the neural network likelihood as a chain of linear operations. Performing variational inference on this construction enables a sampling-free computation of the evidence lower bound which is a more effective approximation than the widely applied Monte Carlo sampling and CLT related techniques. We evaluate the model on a range of regression and classification tasks against BNN inference alternatives, showing competitive or improved performance over the current state-of-the-art.

neural network, variance, variational inference, (15 more...)

1805.07654

Country:

Europe > Germany (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Harvey, William, Teng, Michael, Wood, Frank

Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

arXiv.org Machine LearningJun-12-2019

We introduce the use of Bayesian optimal experimental design techniques for generating glimpse sequences to use in semi-supervised training of hard attention networks. Hard attention holds the promise of greater energy efficiency and superior inference performance. Employing such networks for image classification usually involves choosing a sequence of glimpse locations from a stochastic policy. As the outputs of observations are typically non-differentiable with respect to their glimpse locations, unsupervised gradient learning of such a policy requires REINFORCE-style updates. Also, the only reward signal is the final classification accuracy. For these reasons hard attention networks, despite their promise, have not achieved the wide adoption that soft attention networks have and, in many practical settings, are difficult to train. We find that our method for semi-supervised training makes it easier and faster to train hard attention networks and correspondingly could make them practical to consider in situations where they were not before.

artificial intelligence, bayesian inference, machine learning, (14 more...)

1906.05462

Country:

North America > Canada (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

arXiv.org Machine LearningJun-11-2019

Replica-exchange Nos\'e-Hoover dynamics for Bayesian learning on large datasets

Luo, Rui, Zhang, Qiang, Yang, Yaodong, Wang, Jun

In this paper, we propose a new sampler for Bayesian learning that can efficiently draw representative samples from complex posterior distributions with multiple isolated modes in the presence of mini-batch noise. This is done by simulating a collection of replicas in parallel with different temperatures. When evolving the Nos\'e-Hoover dynamics, the sampler adaptively neutralizes the mini-batch noise. To approximate the detailed balance, configuration exchange is performed periodically between adjacent replicas according to a noise-aware test of acceptance. While its effectiveness on complex multimodal posteriors has been illustrated by testing over synthetic distributions, experiments on deep Bayesian neural network learning have shown its significant improvements over strong baselines for image classification.

artificial intelligence, machine learning, replica, (19 more...)

1905.12569

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Gianniotis, Nikolaos, Schnörr, Christoph, Molkenthin, Christian, Bora, Sanjay Singh

Approximate Variational Inference Based on a Finite Sample of Gaussian Latent Variables

arXiv.org Machine LearningJun-11-2019

Variational methods are employed in situations where exact Bayesian inference becomes intractable due to the difficulty in performing certain integrals. Typically, variational methods postulate a tractable posterior and formulate a lower bound on the desired integral to be approximated, e.g. marginal likelihood. The lower bound is then optimised with respect to its free parameters, the so called variational parameters. However, this is not always possible as for certain integrals it is very challenging (or tedious) to come up with a suitable lower bound. Here we propose a simple scheme that overcomes some of the awkward cases where the usual variational treatment becomes difficult. The scheme relies on a rewriting of the lower bound on the model log-likelihood. We demonstrate the proposed scheme on a number of synthetic and real examples, as well as on a real geophysical model for which the standard variational approaches are inapplicable.

approximation, artificial intelligence, machine learning, (17 more...)

doi: 10.1007/s10044-015-0496-9

1906.04507

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

Rodrigues, Filipe, Ortelli, Nicola, Bierlaire, Michel, Pereira, Francisco

Bayesian Automatic Relevance Determination for Utility Function Specification in Discrete Choice Models

Specifying utility functions is a key step towards applying the discrete choice framework for understanding the behaviour processes that govern user choices. However, identifying the utility function specifications that best model and explain the observed choices can be a very challenging and time-consuming task. This paper seeks to help modellers by leveraging the Bayesian framework and the concept of automatic relevance determination (ARD), in order to automatically determine an optimal utility function specification from an exponentially large set of possible specifications in a purely data-driven manner. Based on recent advances in approximate Bayesian inference, a doubly stochastic variational inference is developed, which allows the proposed DCM-ARD model to scale to very large and high-dimensional datasets. Using semi-artificial choice data, the proposed approach is shown to very accurately recover the true utility function specifications that govern the observed choices. Moreover, when applied to real choice data, DCM-ARD is shown to be able discover high quality specifications that can outperform previous ones from the literature according to multiple criteria, thereby demonstrating its practical applicability.

game theory, specification, survey article, (20 more...)

1906.03855

Country: Europe > Denmark (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.93)
Materials > Chemicals > Industrial Gases > Liquified Gas (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.67)
Energy > Oil & Gas > Midstream (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(2 more...)

Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons

Wang, Jingyan, Shah, Nihar B., Ravi, R.

A number of applications (e.g., AI bot tournaments, sports, peer grading, crowdsourcing) use pairwise comparison data and the Bradley-Terry-Luce (BTL) model to evaluate a given collection of items (e.g., bots, teams, students, search results). Past work has shown that under the BTL model, the widely-used maximum-likelihood estimator (MLE) is minimax-optimal in estimating the item parameters, in terms of the mean squared error. However, another important desideratum for designing estimators is fairness. In this work, we consider fairness modeled by the notion of bias in statistics. We show that the MLE incurs a suboptimal rate in terms of bias. We then propose a simple modification to the MLE, which "stretches" the bounding box of the maximum-likelihood optimizer by a small constant factor from the underlying ground truth domain. We show that this simple modification leads to an improved rate in bias, while maintaining minimax-optimality in the mean squared error. In this manner, our proposed class of estimators provably improves fairness represented by bias without loss in accuracy.

artificial intelligence, machine learning, theorem 2, (19 more...)

1906.04066

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Rodrigues, G. S., Nott, D. J., Sisson, S. A.

Likelihood-free approximate Gibbs sampling

Likelihood-free methods refer to procedures that perform likelihood-based statistical inference, but without direct evaluation of the likelihood function. This is attractive when the likelihood function is computationally prohibitive to evaluate due to dataset size or model complexity, or when the likelihood function is only known through a data generation process. Some classes of likelihood-free methods include pseudo-marginal methods (Beaumont 2003; Andrieu and Roberts 2009), indirect inference (Gourieroux et al. 1993) and approximate Bayesian computation (Sisson et al. 2018a). In particular, approximate Bayesian computation (ABC) methods form an approximation to the computationally intractable posterior distribution by firstly sampling parameter vectors from the prior, and conditional on these, generating synthetic datasets under the model. The parameter vectors are then weighted by how well a vector of summary statistics of the synthetic datasets matches the same summary statistics of the observed data. ABC methods have seen extensive application and development over the past 15 years.

artificial intelligence, conditional distribution, machine learning, (16 more...)

1906.04347

Country:

South America > Brazil > Federal District > Brasília (0.04)
Oceania > New Zealand (0.04)
Oceania > Australia > Queensland (0.04)
(5 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Streaming Variational Monte Carlo

Zhao, Yuan, Nassar, Josue, Jordan, Ian, Bugallo, Mónica, Park, Il Memming

Nonlinear state-space models are powerful tools to describe dynamical structures in complex time series. In a streaming setting where data are processed one sample at a time, simultaneously inferring the state and their nonlinear dynamics has posed significant challenges in practice. We develop a novel online learning framework, leveraging variational inference and sequential Monte Carlo, which enables flexible and accurate Bayesian joint filtering. Our method provides a filtering posterior arbitrarily close to the true filtering distribution for a wide class of dynamics models and observation models. Specifically, the proposed framework can efficiently infer a posterior over the dynamics using sparse Gaussian processes. Constant time complexity per sample makes our approach amenable to online learning scenarios and suitable for real-time applications.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1906.01549

Country:

Asia > Middle East > Jordan (0.14)
North America > United States > New York > Suffolk County > Stony Brook (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Herta, Christian, Voigt, Benjamin

Radial Prediction Layer

For a broad variety of critical applications, it is essential to know how confident a classification prediction is. In this paper, we discuss the drawbacks of softmax to calculate class probabilities and to handle uncertainty in Bayesian neural networks. We introduce a new kind of prediction layer called radial prediction layer (RPL) to overcome these issues. In contrast to the softmax classification, RPL is based on the open-world assumption. Therefore, the class prediction probabilities are much more meaningful to assess the uncertainty concerning the novelty of the input. We show that neural networks with RPLs can be learned in the same way as neural networks using softmax. On a 2D toy data set (spiral data), we demonstrate the fundamental principles and advantages. On the real-world ImageNet data set, we show that the open-world properties are beneficially fulfilled. Additionally, we show that RPLs are less sensible to adversarial attacks on the MNIST data set. Due to its features, we expect RPL to be beneficial in a broad variety of applications, especially in critical environments, such as medicine or autonomous driving.

artificial intelligence, bayesian inference, machine learning, (19 more...)

1905.1115

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.68)
Information Technology (0.68)
Government (0.48)
Transportation > Ground (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)