Supplementary Document to the Paper "Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee"

Neural Information Processing Systems

As a technical tool for the proof, we first restate Lemma 6.1 of Chérief-Abdellatif and Alquier. The first inequality is due to Lemma 1.1, and the second … Under Conditions 4.1–4.2, we have the following lemma, which shows the existence of testing functions. Now we define φ = max … Note that log K = log N(ε, …). Hence we conclude the proof. We start with the first component. Following Pati et al. (2018), it can be shown that … the third term on the RHS of (9) is bounded by 3/(2nσ…); similarly, the fifth term on the RHS of (9) is bounded by O(1/n). The convergence under the squared Hellinger distance is a direct result of Lemmas 4.1 and 4.2. As mentioned by Sønderby et al. (2016) and Molchanov et al. (2017), training sparse … The optimization method used is Adam. Implementation details for the UCI datasets and MNIST can be found in Sections 2.5 and 2.6. In this section, we aim to demonstrate, via a toy example, that there is little difference between the results using the inverse-CDF reparameterization and the Gumbel-softmax approximation.
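For intuition about that last comparison, here is a minimal NumPy sketch (not taken from the paper; the inclusion probability theta, temperature tau, and sample count are illustrative assumptions) contrasting one standard version of each sampler for a binary Concrete variable:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, tau, n = 0.3, 0.5, 100_000  # inclusion probability, temperature, sample count (illustrative)
logit = np.log(theta) - np.log1p(-theta)

# Gumbel-softmax (binary Concrete) relaxation of Bernoulli(theta):
# z = sigmoid((logit + g1 - g0) / tau) with g0, g1 ~ Gumbel(0, 1).
g1, g0 = rng.gumbel(size=n), rng.gumbel(size=n)
z_gumbel = 1.0 / (1.0 + np.exp(-(logit + g1 - g0) / tau))

# Inverse-CDF reparameterization of the same relaxation:
# for u ~ Uniform(0, 1), log(u / (1 - u)) is Logistic(0, 1),
# which has the same law as g1 - g0.
u = rng.uniform(size=n)
z_icdf = 1.0 / (1.0 + np.exp(-(logit + np.log(u) - np.log1p(-u)) / tau))

# The two samplers agree in distribution, so downstream gradient
# estimates built on either relaxation should be close.
print(z_gumbel.mean(), z_icdf.mean())
```

The equivalence holds because the difference of two independent standard Gumbel variables is Logistic(0, 1), which is exactly what the inverse logistic CDF applied to a uniform draw produces; the snippet's observed "little difference" is consistent with this.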


Maxitive Donsker-Varadhan Formulation for Possibilistic Variational Inference

Singh, Jasraj, Wongso, Shelvia, Houssineau, Jeremie, Chérief-Abdellatif, Badr-Eddine

arXiv.org Machine Learning

Variational inference (VI) is a cornerstone of modern Bayesian learning, enabling approximate inference in complex models that would otherwise be intractable. However, its formulation depends on expectations and divergences defined through high-dimensional integrals, often rendering analytical treatment impossible and necessitating heavy reliance on approximate learning and inference techniques. Possibility theory, an imprecise-probability framework, makes it possible to model epistemic uncertainty directly instead of leveraging subjective probabilities. While this framework provides robustness and interpretability under sparse or imprecise information, adapting VI to the possibilistic setting requires rethinking core concepts such as entropy and divergence, which presuppose additivity. In this work, we develop a principled formulation of possibilistic variational inference and apply it to a special class of exponential-family functions, highlighting parallels with their probabilistic counterparts and revealing the distinctive mathematical structures of possibility theory.
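For context, the classical probabilistic result that the title's maxitive formulation parallels is the Donsker-Varadhan variational representation of the KL divergence, where the supremum ranges over bounded measurable functions f:

```latex
\mathrm{KL}(P \,\|\, Q)
  \;=\; \sup_{f}\;\Big\{\, \mathbb{E}_{P}[f(X)] \;-\; \log \mathbb{E}_{Q}\big[e^{f(X)}\big] \,\Big\}.
```

Loosely speaking, a maxitive analogue would replace the additive expectations here with supremum-based possibilistic counterparts; making that replacement precise is the subject of the paper, not something this formula by itself supplies.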




Provable Gradient Variance Guarantees for Black-Box Variational Inference

Neural Information Processing Systems

Recent variational inference methods use stochastic gradient estimators whose variance is not well understood. Theoretical guarantees on this variance are important for understanding when these methods will or will not work.
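As a concrete, hypothetical instance of the estimators in question, here is a minimal NumPy sketch of a reparameterization-trick ELBO gradient for a Gaussian variational family against a toy target; the target density, sample count, and parameterization are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def dlog_p(z):
    # Score of an illustrative target log p(z) = -0.5 * (z - 1)^2, i.e., N(1, 1).
    return -(z - 1.0)

def reparam_elbo_grad(mu, log_sigma, n_samples=1_000):
    """Monte Carlo reparameterization gradient of the ELBO for q = N(mu, sigma^2).

    ELBO = E_q[log p(z)] + H(q), with z = mu + sigma * eps, eps ~ N(0, 1);
    the Gaussian entropy H(q) contributes +1 to the log_sigma gradient.
    """
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps
    grad_mu = dlog_p(z).mean()                               # dz/dmu = 1
    grad_log_sigma = (dlog_p(z) * sigma * eps).mean() + 1.0  # dz/dlog_sigma = sigma * eps
    return grad_mu, grad_log_sigma

print(reparam_elbo_grad(0.0, 0.0))
```

Guarantees of the kind the abstract describes bound the variance of Monte Carlo estimates such as grad_mu and grad_log_sigma above, which in turn governs how large a step size stochastic optimization can tolerate.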




Main remarks regarding baseline, scalability, complexity, and the full-batch setting in the following paragraphs

Neural Information Processing Systems

We thank the reviewers for the valuable comments and suggestions. The reviewers' main concern is the lack of … (the RQVI procedure led to computational instability). … models (GLM, BNN) and five datasets (Boston, Fires, Life Expect., Frisk, and Metro) with a learning-rate analysis. We do not claim that this method is suitable for high-dimensional posteriors. It is accurate that the method will not be viable without this property.