Sparse Deep Learning: A New Framework Immune to Local Traps and Miscalibration
Neural Information Processing Systems
We first define the equivalence class of neural network parameters. A remark on notation: ν(·) is similar to ν(·) as defined in Section 2.1 of the main text. In what follows, we use ν(β) and ν(γ) to denote the connection weights and the network structure of ν(β, γ), respectively. The proof of Theorem 2.2 can be carried out with the same strategy as that used in proving Theorem 2.1. Here we provide a simpler proof that builds on the result of Theorem 2.1.
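As a hedged illustration of why such an equivalence class is natural (this is a standard observation, not the paper's own definition): for a one-hidden-layer network with an odd activation such as tanh, permuting the hidden units produces a different parameter vector that represents exactly the same function, so distinct parameterizations should be grouped together.

```latex
% Standard hidden-unit permutation symmetry (illustrative only; the symbols
% w_j, b_j, m, and sigma are assumptions, not the paper's notation):
\[
  \mu(\beta, x) \;=\; \sum_{j=1}^{m} w_j \tanh\!\big(b_j^{\top} x\big)
  \;=\; \sum_{j=1}^{m} w_{\sigma(j)} \tanh\!\big(b_{\sigma(j)}^{\top} x\big)
  \qquad \text{for any permutation } \sigma \text{ of } \{1, \dots, m\}.
\]
```

Any two parameter vectors related by such a permutation (and, for odd activations, simultaneous sign flips of a unit's incoming and outgoing weights) realize the same input-output map and hence belong to the same equivalence class.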