Supplementary Document to the Paper "Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee"
Neural Information Processing Systems
As a technical tool for the proof, we first restate Lemma 6.1 of Chérief-Abdellatif and Alquier. [...] The first inequality is due to Lemma 1.1, and the second [...]

Under Conditions 4.1-4.2, we have the following lemma, which shows the existence of testing functions. [...] Now we define φ = max [...] Note that log K = log N(ε [...] Hence we conclude the proof.

We start with the first component. As in Pati et al. (2018), it can be shown that [...] the third term on the RHS of (9) is bounded by 3/(2nσ²) [...] Similarly, the fifth term on the RHS of (9) is bounded by O(1/n). The convergence under the squared Hellinger distance is a direct result of Lemmas 4.1 and 4.2, by [...]

As mentioned by Sønderby et al. (2016) and Molchanov et al. (2017), training sparse [...] The optimization method used is Adam. The implementation details for the UCI datasets and MNIST can be found in Sections 2.5 and 2.6.

In this section, we aim to demonstrate, via a toy example, that there is little difference between the results obtained with the inverse-CDF reparameterization and with the Gumbel-softmax approximation.
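The two sampling schemes compared in the toy example can be sketched as follows. This is a minimal illustration, not the authors' implementation: for a single Bernoulli inclusion probability p, the inverse-CDF scheme draws hard 0/1 samples via z = 1{u < p} with u uniform, while the binary Gumbel-softmax (Concrete) scheme draws relaxed samples in (0, 1) from perturbed logits divided by a temperature tau. The function names, the choice of p = 0.3, and tau = 0.5 are ours, chosen only for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def bernoulli_inverse_cdf(p, size, rng):
    """Hard Bernoulli(p) samples via the inverse CDF: z = 1{u < p}."""
    u = rng.uniform(size=size)
    return (u < p).astype(float)

def bernoulli_gumbel_softmax(p, tau, size, rng):
    """Relaxed Bernoulli(p) samples via the binary Gumbel-softmax trick."""
    # One Gumbel(0, 1) noise per category {0, 1}.
    g = -np.log(-np.log(rng.uniform(size=(size, 2))))
    logits = np.log([1.0 - p, p])
    scores = (logits + g) / tau
    # Softmax over the two categories; return the weight of category 1.
    scores -= scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    return w[:, 1] / w.sum(axis=1)

p, tau, n = 0.3, 0.5, 200_000
hard = bernoulli_inverse_cdf(p, n, rng)
soft = bernoulli_gumbel_softmax(p, tau, n, rng)
# Rounding the relaxed samples recovers Bernoulli(p) exactly:
# soft > 0.5 iff the perturbed logit of category 1 wins, which has probability p.
print(hard.mean(), (soft > 0.5).mean())  # both close to p = 0.3
```

As tau decreases toward 0, the relaxed samples concentrate near {0, 1} and the two schemes produce increasingly similar draws, which is the behavior the toy example in this section is checking.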