Supplementary material for "Walsh-Hadamard Variational Inference for Bayesian Deep Learning", Simone Rossi
(the cube in Figure 1). For each dimension, the orange dots represent 20 repetitions; the median distance is displayed in black. A few outliers (with distance greater than 3.0) appeared, possibly due to imperfect numerical optimization. Results are reported in Table 2. Comparison of test error w.r.t. the number of model parameters (top: mean field, bottom: full covariance).
First of all, we would like to thank the Reviewers for their valuable comments and useful suggestions, which will help us improve the paper. Below, we address the main points raised by the Reviewers.

- "I'm not sure how well this is aligned with the goal of the paper"
- "I would have liked [...] evaluation of uncertainty calibration"
- "The paper does not compare with [...] deep ensembles [...] non-Bayesian"
- "Why did you use SGHMC instead of full HMC?" For this simple example we do not expect it to make a big difference (the R-hat statistic showed its convergence). We will add some traces and the setup in the supplement.
- "Why do you assume a fully factorized Gaussian posterior" Our parameterization, in the case of output dimension 1, is equivalent to mean field.
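The convergence check mentioned in the rebuttal (the R-hat statistic) can be sketched as follows. This is a generic split-R-hat (Gelman-Rubin) diagnostic written for illustration, not the authors' actual code; the function name and the synthetic chains are assumptions.

```python
import numpy as np

def split_rhat(chains):
    """Split-R-hat convergence diagnostic for MCMC chains.

    chains: array of shape (n_chains, n_samples) for one scalar parameter.
    Values close to 1.0 indicate the chains have mixed well.
    """
    m, n = chains.shape
    half = n // 2
    # Split each chain in half so within-chain non-stationarity also
    # inflates the statistic.
    split = np.vstack([chains[:, :half], chains[:, half:2 * half]])
    n = half
    chain_means = split.mean(axis=1)
    B = n * chain_means.var(ddof=1)       # between-chain variance
    W = split.var(axis=1, ddof=1).mean()  # mean within-chain variance
    var_plus = (n - 1) / n * W + B / n    # pooled variance estimate
    return np.sqrt(var_plus / W)

# Well-mixed chains drawn from the same distribution give R-hat near 1.
rng = np.random.default_rng(0)
good = rng.standard_normal((4, 1000))
print(split_rhat(good))  # close to 1.0
```

In practice one computes this per parameter (or per predictive quantity) and declares convergence only when all values are close to 1, e.g. below 1.01.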
Review for NeurIPS paper: Walsh-Hadamard Variational Inference for Bayesian Deep Learning
Weaknesses: The main weakness I see with this paper is its empirical evaluation, which could be more convincing. While the experiments on CNNs show that WHVI is competitive with other approaches on VGG16 while being more parameter efficient (which is impressive), I am not sure how well this is aligned with the goal of the paper. I was under the impression that the goal of the paper was to improve Bayesian inference in deep neural networks (for which I would expect stronger results), but instead the goal might be to reduce the number of model parameters without sacrificing accuracy -- it would be great if the authors could clarify this. Furthermore, I would have liked to see a more extensive evaluation of uncertainty calibration, in both in-domain and especially out-of-domain settings, using e.g. the benchmarks proposed in Ovadia et al. 2019, which would further strengthen the paper. Also, the paper does not compare against state-of-the-art methods for deep uncertainty quantification such as deep ensembles (Lakshminarayanan et al. 2017, Ovadia et al. 2019), which makes it hard to assess the potential impact of the proposed approach.
Efficient Approximate Inference with Walsh-Hadamard Variational Inference
Rossi, Simone, Marmin, Sebastien, Filippone, Maurizio
Variational inference offers scalable and flexible tools to tackle intractable Bayesian inference of modern statistical models like Bayesian neural networks and Gaussian processes. For largely over-parameterized models, however, the over-regularization property of the variational objective makes the application of variational inference challenging. Inspired by the literature on kernel methods, and in particular on structured approximations of distributions of random matrices, this paper proposes Walsh-Hadamard Variational Inference, which uses Walsh-Hadamard-based factorization strategies to reduce model parameterization, accelerate computations, and increase the expressiveness of the approximate posterior beyond a fully factorized one.
Walsh-Hadamard Variational Inference for Bayesian Deep Learning
Rossi, Simone, Marmin, Sebastien, Filippone, Maurizio
Over-parameterized models, such as DeepNets and ConvNets, form a class of models that are routinely adopted in a wide variety of applications, and for which Bayesian inference is desirable but extremely challenging. Variational inference offers the tools to tackle this challenge in a scalable way and with some degree of flexibility on the approximation, but for over-parameterized models this is challenging due to the over-regularization property of the variational objective. Inspired by the literature on kernel methods, and in particular on structured approximations of distributions of random matrices, this paper proposes Walsh-Hadamard Variational Inference (WHVI), which uses Walsh-Hadamard-based factorization strategies to reduce the parameterization and accelerate computations, thus avoiding over-regularization issues with the variational objective. Extensive theoretical and empirical analyses demonstrate that WHVI yields considerable speedups and model reductions compared to other techniques to carry out approximate inference for over-parameterized models, and ultimately show how advances in kernel methods can be translated into advances in approximate Bayesian inference.
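As a rough sketch of the factorization the abstract describes: WHVI writes a D x D weight matrix as W = S1 H diag(s) H S2, with diagonal S1 and S2 and a variational posterior only over the vector s, so the parameter count grows linearly rather than quadratically in D. The variable names and values below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(0)

D = 64  # layer width; the Hadamard matrix requires D to be a power of two
H = hadamard(D) / np.sqrt(D)  # normalized Walsh-Hadamard matrix

# Diagonal scaling vectors and a factorized Gaussian posterior q(s);
# all values here are placeholders for illustration.
s1 = rng.standard_normal(D)   # diagonal of S1
s2 = rng.standard_normal(D)   # diagonal of S2
mu = np.zeros(D)              # variational mean of q(s)
sigma = 0.1 * np.ones(D)      # variational std of q(s)

def sample_weight():
    # Reparameterized sample s ~ q(s) = N(mu, diag(sigma^2))
    s = mu + sigma * rng.standard_normal(D)
    # S1 H diag(s) H S2, with the diagonal scalings applied via broadcasting
    return (s1[:, None] * H) @ (s[:, None] * H) * s2

W = sample_weight()
print(W.shape)       # (64, 64)
# 4*D structural/variational parameters versus D*D for a dense mean-field
# Gaussian over the full weight matrix:
print(4 * D, D * D)  # 256 4096
```

In an actual implementation the explicit matrix H would be replaced by a fast Walsh-Hadamard transform, bringing each matrix-vector product down to O(D log D) time and O(D) memory, which is where the speedups claimed in the abstract come from.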