Supplemental Material: Efficient Neural Network Training via Forward and Backward Propagation Sparsification
Neural Information Processing Systems
This appendix can be divided into four parts. Section A gives the detailed proof of Theorem 1 and discusses the convergence of our method. Before giving the detailed proof, we present the following two properties of overparameterized deep neural networks, which are implied by recent studies based on mean field theory. We will empirically verify these properties in this section and adopt them as assumptions in our proof. That's why Property 1 holds.
Aug-15-2025, 12:02:05 GMT