AITopics | gaussian dropout

Variational Dropout and the Local Reparameterization Trick

We investigate a local reparameterization technique for greatly reducing the variance of stochastic gradients for variational Bayesian inference (SGVB) of a posterior over model parameters, while retaining parallelizability. This local reparameterization translates uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such parameterizations can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. Additionally, we explore a connection with dropout: Gaussian dropout objectives correspond to SGVB with local reparameterization, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout where the dropout rates are learned, often leading to better models. The method is demonstrated through several experiments.

artificial intelligence, dropout, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Orange County > Irvine (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Variational Dropout and the Local Reparameterization Trick (Machine Learning Group, University of Amsterdam)

Neural Information Processing Systems | Mar-13-2024, 03:01:12 GMT

We investigate a local reparameterization technique for greatly reducing the variance of stochastic gradients for variational Bayesian inference (SGVB) of a posterior over model parameters, while retaining parallelizability. This local reparameterization translates uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such parameterizations can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. Additionally, we explore a connection with dropout: Gaussian dropout objectives correspond to SGVB with local reparameterization, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout where the dropout rates are learned, often leading to better models. The method is demonstrated through several experiments.

dropout, inference, neural network, (16 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.40)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Orange County > Irvine (0.04)

Dropout-Based Rashomon Set Exploration for Efficient Predictive Multiplicity Estimation

Hsu, Hsiang, Li, Guihong, Hu, Shaohan, Chen, Chun-Fu

arXiv.org Artificial Intelligence | Feb-1-2024

Predictive multiplicity refers to the phenomenon in which classification tasks may admit multiple competing models that achieve almost-equally-optimal performance, yet generate conflicting outputs for individual samples. This presents significant concerns, as it can potentially result in systemic exclusion, inexplicable discrimination, and unfairness in practical applications. Measuring and mitigating predictive multiplicity, however, is computationally challenging due to the need to explore all such almost-equally-optimal models, known as the Rashomon set, in potentially huge hypothesis spaces. To address this challenge, we propose a novel framework that utilizes dropout techniques for exploring models in the Rashomon set. We provide rigorous theoretical derivations to connect the dropout parameters to properties of the Rashomon set, and empirically evaluate our framework through extensive experimentation. Numerical results show that our technique consistently outperforms baselines in terms of the effectiveness of predictive multiplicity metric estimation, with runtime speedup up to $20\times \sim 5000\times$. With efficient Rashomon set exploration and metric estimation, mitigation of predictive multiplicity is then achieved through dropout ensemble and model selection.

dataset, predictive multiplicity metric, rashomon, (11 more...)

arXiv.org Artificial Intelligence

2402.00728

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > Canada > Ontario > Toronto (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Banking & Finance (1.00)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Dropout Regularization in Extended Generalized Linear Models based on Double Exponential Families

Schwienhorst, Benedikt Lütke, Kock, Lucas, Nott, David J., Klein, Nadja

arXiv.org Artificial Intelligence | May-11-2023

Even though dropout is a popular regularization technique, its theoretical properties are not fully understood. In this paper we study dropout regularization in extended generalized linear models based on double exponential families, for which the dispersion parameter can vary with the features. A theoretical analysis shows that dropout regularization prefers rare but important features in both the mean and dispersion, generalizing an earlier result for conventional generalized linear models. Training is performed using stochastic gradient descent with adaptive learning rate. To illustrate, we apply dropout to adaptive smoothing with B-splines, where both the mean and dispersion parameters are modelled flexibly. The important B-spline basis functions can be thought of as rare features, and we confirm in experiments that dropout is an effective form of regularization for mean and dispersion parameters that improves on a penalized maximum likelihood approach with an explicit smoothness penalty.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.06625

Country:

North America > United States > New York (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(2 more...)

Genre: Research Report (0.82)

Calibration of Model Uncertainty for Dropout Variational Inference

Laves, Max-Heinrich, Ihler, Sontje, Kortmann, Karl-Philipp, Ortmaier, Tobias

arXiv.org Machine Learning | Jun-20-2020

The model uncertainty obtained by variational Bayesian inference with Monte Carlo dropout is prone to miscalibration. In this paper, different logit scaling methods are extended to dropout variational inference to recalibrate model uncertainty. Expected uncertainty calibration error (UCE) is presented as a metric to measure miscalibration. The effectiveness of recalibration is evaluated on CIFAR-10/100 and SVHN for recent CNN architectures. Experimental results show that logit scaling considerably reduces miscalibration as measured by UCE. Well-calibrated uncertainty enables reliable rejection of uncertain predictions and robust detection of out-of-distribution data.
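As a rough illustration of logit scaling combined with Monte Carlo dropout, the sketch below applies temperature scaling, one common logit scaling method, to the logits of each stochastic forward pass and then averages the resulting softmax probabilities. The function names, the example logits, and the choice of a single scalar temperature are assumptions made for illustration; the abstract does not specify the paper's exact procedure.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mc_dropout_predict(logit_samples, temperature=1.0):
    """Average softmax(logits / T) over several stochastic forward passes.

    logit_samples: list of logit vectors, one per forward pass with dropout
    active (hypothetical input; producing them requires a trained network).
    temperature: scalar T > 0; T > 1 softens the predicted distribution.
    """
    n = len(logit_samples)
    k = len(logit_samples[0])
    mean = [0.0] * k
    for logits in logit_samples:
        probs = softmax([z / temperature for z in logits])
        for c in range(k):
            mean[c] += probs[c] / n
    return mean


# Two made-up forward passes for a three-class problem.
samples = [[2.0, 0.0, -1.0], [1.5, 0.5, -0.5]]
print(mc_dropout_predict(samples, temperature=1.0))
print(mc_dropout_predict(samples, temperature=2.0))
```

In practice the temperature would be fitted on held-out data; here it is simply passed in by hand to show that a larger value lowers the confidence of the averaged prediction.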

calibration, deep learning, neural network, (17 more...)

arXiv.org Machine Learning

2006.11584

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas (0.34)

Continuous Dropout

Shen, Xu, Tian, Xinmei, Liu, Tongliang, Xu, Fang, Tao, Dacheng

arXiv.org Machine Learning | Nov-28-2019

Dropout has been proven to be an effective algorithm for training robust deep networks because of its ability to prevent overfitting by avoiding the co-adaptation of feature detectors. Current explanations of dropout include bagging, naive Bayes, regularization, and sex in evolution. According to the activation patterns of neurons in the human brain, when faced with different situations, the firing rates of neurons are random and continuous, not binary as in current dropout. Inspired by this phenomenon, we extend the traditional binary dropout to continuous dropout. On the one hand, continuous dropout is considerably closer to the activation characteristics of neurons in the human brain than traditional binary dropout. On the other hand, we demonstrate that continuous dropout has the property of avoiding the co-adaptation of feature detectors, which suggests that we can extract more independent feature detectors for model averaging in the test stage. We introduce the proposed continuous dropout to a feedforward neural network and comprehensively compare it with binary dropout, adaptive dropout, and DropConnect on MNIST, CIFAR-10, SVHN, NORB, and ILSVRC-12. Thorough experiments demonstrate that our method performs better in preventing the co-adaptation of feature detectors and improves test performance. The code is available at: https://github.com/jasonustc/caffe-multigpu/tree/dropout.
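To make the contrast between binary and continuous masks concrete, here is a minimal sketch. It assumes the continuous mask is drawn uniformly from [0, 1], which is only one possible choice and is not stated in the abstract; the function names and example values are hypothetical.

```python
import random


def binary_dropout_mask(n, keep_prob, rng):
    """Traditional dropout: each unit is either kept (1.0) or dropped (0.0)."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(n)]


def continuous_dropout_mask(n, rng):
    """Continuous variant (assumed here): each unit is scaled by a value in [0, 1]."""
    return [rng.uniform(0.0, 1.0) for _ in range(n)]


def apply_mask(activations, mask):
    """Element-wise product of activations and mask."""
    return [a * m for a, m in zip(activations, mask)]


rng = random.Random(0)
acts = [0.5, -1.2, 3.0, 0.8]
print(apply_mask(acts, binary_dropout_mask(len(acts), 0.5, rng)))
print(apply_mask(acts, continuous_dropout_mask(len(acts), rng)))
```

With a binary mask every activation is either passed through or zeroed; with the continuous mask every activation is attenuated by a random amount.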

continuous dropout, dropout, gaussian dropout, (12 more...)

arXiv.org Machine Learning

1911.12675

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China (0.04)
Oceania > Australia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Variational Bayesian Dropout

Liu, Yuhang, Dong, Wenyong, Zhang, Lei, Gong, Dong, Shi, Qinfeng

arXiv.org Machine Learning | Nov-20-2018

Variational dropout (VD) is a generalization of Gaussian dropout, which aims at inferring the posterior of network weights based on a log-uniform prior on them to learn these weights as well as dropout rate simultaneously. The log-uniform prior not only interprets the regularization capacity of Gaussian dropout in network training, but also underpins the inference of such posterior. However, the log-uniform prior is an improper prior (i.e., its integral is infinite) which causes the inference of posterior to be ill-posed, thus restricting the regularization performance of VD. To address this problem, we present a new generalization of Gaussian dropout, termed variational Bayesian dropout (VBD), which turns to exploit a hierarchical prior on the network weights and infer a new joint posterior. Specifically, we implement the hierarchical prior as a zero-mean Gaussian distribution with variance sampled from a uniform hyper-prior. Then, we incorporate such a prior into inferring the joint posterior over network weights and the variance in the hierarchical prior, with which both the network training and the dropout rate estimation can be cast into a joint optimization problem. More importantly, the hierarchical prior is a proper prior which enables the inference of posterior to be well-posed. In addition, we further show that the proposed VBD can be seamlessly applied to network compression. Experiments on both classification and network compression tasks demonstrate the superior performance of the proposed VBD in terms of regularizing network training.

artificial intelligence, dropout, machine learning, (18 more...)

arXiv.org Machine Learning

1811.07533

Country: Asia (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

[R] [1705.07832] Concrete Dropout -- learnable dropout probabilities!! • r/MachineLearning

@machinelearnbot | May-23-2017, 17:31:32 GMT

The original one, Variational Dropout and the Local Reparameterization Trick, is cited in the Concrete Dropout paper and is indeed somewhat limited; however, this issue is resolved in Variational Dropout Sparsifies Deep Neural Networks (accepted to ICML '17, paper from my labmates). They have a very strange excuse to avoid comparison with the latter paper (IMO both methods use different relaxations, it'd be useful to compare them face-to-face): "We chose not to compare to Gaussian dropout in our experiments, as when optimising Gaussian dropout's α following its variational interpretation [23], the method is known to underperform [28]." UPD: there's also Generalized Dropout (uses a straight-through estimator, which is not an unbiased gradient estimator) and Information Dropout, which does not use a binary formulation.

artificial intelligence, dropout probability, machine learning, (6 more...)

@machinelearnbot

Industry: Media > News (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Variational Dropout and the Local Reparameterization Trick

Kingma, Durk P., Salimans, Tim, Welling, Max

Neural Information Processing Systems | Dec-31-2015

We explore an as yet unexploited opportunity for drastically improving the efficiency of stochastic gradient variational Bayes (SGVB) with global model parameters. Regular SGVB estimators rely on sampling of parameters once per minibatch of data, and have variance that is constant w.r.t. the minibatch size. The efficiency of such estimators can be drastically improved upon by translating uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such reparameterizations with local noise can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. We find an important connection with regularization by dropout: the original Gaussian dropout objective corresponds to SGVB with local noise, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout, but with a more flexibly parameterized posterior, often leading to better generalization. The method is demonstrated through several experiments.
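The idea of turning weight uncertainty into per-datapoint noise can be sketched for a single linear layer. The sketch assumes a fully factorized Gaussian posterior over the weights, with mean mu[i][j] and variance sigma2[i][j]; under that assumption each pre-activation is itself Gaussian and can be sampled directly, with an independent draw for every datapoint. The function name and the example values are hypothetical, and this is not the authors' implementation.

```python
import math
import random


def local_reparam_layer(inputs, mu, sigma2, rng):
    """Sample linear-layer pre-activations with independent noise per datapoint.

    inputs: minibatch as a list of rows, each of length n_in
    mu, sigma2: n_in x n_out nested lists holding the mean and variance of
    each weight (assumed factorized Gaussian posterior).
    """
    n_in, n_out = len(mu), len(mu[0])
    batch_out = []
    for row in inputs:
        acts = []
        for j in range(n_out):
            # Mean and variance of the pre-activation implied by the weights.
            mean = sum(row[i] * mu[i][j] for i in range(n_in))
            var = sum(row[i] ** 2 * sigma2[i][j] for i in range(n_in))
            # One fresh standard-normal draw per datapoint and output unit.
            acts.append(mean + math.sqrt(var) * rng.gauss(0.0, 1.0))
        batch_out.append(acts)
    return batch_out


rng = random.Random(0)
x = [[1.0, 2.0], [0.5, -1.0]]
mu = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
sigma2 = [[0.01] * 3, [0.01] * 3]
print(local_reparam_layer(x, mu, sigma2, rng))
```

Sampling one weight matrix per minibatch would instead share the same noise across all rows of x, which is the situation the abstract describes as having variance that does not shrink with the minibatch size.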

Variational Dropout and the Local Reparameterization Trick

Kingma, Diederik P., Salimans, Tim, Welling, Max

arXiv.org Machine Learning | Dec-20-2015

We investigate a local reparameterization technique for greatly reducing the variance of stochastic gradients for variational Bayesian inference (SGVB) of a posterior over model parameters, while retaining parallelizability. This local reparameterization translates uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such parameterizations can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. Additionally, we explore a connection with dropout: Gaussian dropout objectives correspond to SGVB with local reparameterization, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout where the dropout rates are learned, often leading to better models. The method is demonstrated through several experiments.
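For readers unfamiliar with Gaussian dropout itself, a minimal sketch follows. It uses the usual formulation in which each activation is multiplied by noise drawn from N(1, α) with α = p / (1 − p) for drop rate p; the function name and example values are hypothetical, and α is fixed here, whereas variational dropout as described above would learn it.

```python
import math
import random


def gaussian_dropout(activations, p, rng):
    """Multiply each activation by noise from N(1, alpha), alpha = p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    alpha = p / (1.0 - p)
    std = math.sqrt(alpha)  # random.gauss expects a standard deviation
    return [a * rng.gauss(1.0, std) for a in activations]


rng = random.Random(0)
print(gaussian_dropout([0.5, -1.2, 3.0], p=0.5, rng=rng))
```

Because the noise has mean 1, the expected activation is unchanged, so no rescaling is needed at test time.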

artificial intelligence, dropout, machine learning, (20 more...)

arXiv.org Machine Learning

1506.02557

Country: North America (0.46)

Genre: Research Report (0.50)
