Appendices for "Pruning Randomly Initialized Neural Networks with Iterative Randomization"
Neural Information Processing Systems
We consider a target neural network $f: \mathbb{R}^{d_0} \to \mathbb{R}^{d_l}$ of depth $l$, which is described as follows. Similar to the previous works [6, 7], we assume that $g(x)$ is twice as deep as the target network $f(x)$. Thus, $g(x)$ can be described as
$$g(x) = G_{2l}\,\sigma(G_{2l-1}\,\sigma(\cdots G_1 x \cdots)), \qquad (2)$$
where $G_j$ is a $\tilde{d}_j \times \tilde{d}_{j-1}$ matrix ($\tilde{d}_j \in \mathbb{N}_{\geq 1}$ for $j = 1, \ldots, 2l$) with $\tilde{d}_{2i} = d_i$. Under this re-sampling assumption, we describe our main theorem as follows.

Theorem A.1 (Main Theorem). Fix $\varepsilon, \delta > 0$, and assume that $\|F_i\|_{\mathrm{Frob}} \leq 1$. Let $R \in \mathbb{N}$, and assume that each element of $G_i$ can be re-sampled with replacement from the uniform distribution $U[-1, 1]$ up to $R - 1$ times. If $n \geq \frac{2}{\varepsilon} \log(1/\delta)$ holds, then with probability at least $1 - \delta$, we have
$$|\alpha - X_i| \leq \varepsilon, \qquad (5)$$
for some $i \in \{1, \ldots, n\}$.
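The sample-complexity condition above can be checked empirically. The sketch below (an illustration of the probabilistic argument, not the paper's proof; the variable names `eps`, `delta`, and `alpha` are our own) draws $n \geq (2/\varepsilon)\log(1/\delta)$ i.i.d. samples from $U[-1,1]$ and estimates how often none of them lands within $\varepsilon$ of a fixed target $\alpha$. Since each draw hits the interval $[\alpha-\varepsilon, \alpha+\varepsilon] \cap [-1,1]$ with probability at least $\varepsilon/2$, the failure probability is at most $(1-\varepsilon/2)^n \leq e^{-n\varepsilon/2} \leq \delta$.

```python
import math
import random

# Illustrative Monte Carlo check of the sampling bound in Theorem A.1:
# with n >= (2/eps) * log(1/delta) draws X_1, ..., X_n from U[-1, 1],
# some draw falls within eps of alpha with probability >= 1 - delta.
eps, delta = 0.1, 0.05
alpha = 1.0  # worst case: half of the eps-interval around alpha lies outside [-1, 1]

n_samples = math.ceil((2 / eps) * math.log(1 / delta))

rng = random.Random(0)
trials = 20_000
failures = 0
for _ in range(trials):
    # One experiment: do any of the n_samples uniform draws land eps-close to alpha?
    hit = any(abs(alpha - rng.uniform(-1.0, 1.0)) <= eps for _ in range(n_samples))
    failures += not hit

failure_rate = failures / trials
print(n_samples, failure_rate)  # empirical failure rate should not exceed delta by much
```

Even at the worst-case $\alpha = 1$, the observed failure rate stays near $(1-\varepsilon/2)^n \approx 0.046$, below $\delta = 0.05$, matching the bound.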