Appendices for "Pruning Randomly Initialized Neural Networks with Iterative Randomization"
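We consider a target neural network $f: \mathbb{R}^{d_0} \to \mathbb{R}^{d_l}$ of depth $l$, which is described as follows. [...] Similar to the previous works [6, 7], we assume that $g(x)$ is twice as deep as the target network $f(x)$. Thus, $g(x)$ can be described as

$$g(x) = G_{2l}\,\sigma(G_{2l-1}\,\sigma(\cdots G_1(x))), \tag{2}$$

where $G_j$ is a $\tilde{d}_j \times \tilde{d}_{j-1}$ matrix ($\tilde{d}_j \in \mathbb{N}_{\geq 1}$ for $j = 1, \dots, 2l$) with $\tilde{d}_{2i} = d_i$. [...] Under this re-sampling assumption, we describe our main theorem as follows.

Theorem A.1 (Main Theorem) Fix $\epsilon, \delta > 0$, and we assume that $\|F_i\|_{\mathrm{Frob}} \leq 1$. Let $R \in \mathbb{N}$ and we assume that each element of $G_i$ can be re-sampled with replacement from the uniform distribution $U[-1, 1]$ up to $R - 1$ times. [...] If $n \geq \frac{2}{\epsilon} \log\left(\frac{1}{\delta}\right)$ holds, then with probability at least $1 - \delta$, we have

$$|\alpha - X_i| \leq \epsilon \tag{5}$$

for some $i \in \{1, \dots, n\}$.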
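The sampling statement that closes the excerpt can be checked numerically. The sketch below assumes the condition reads n >= (2/eps) * log(1/delta), the conclusion reads |alpha - X_i| <= eps, and X_1, ..., X_n are drawn i.i.d. from U[-1, 1]. The function names and the test values (alpha = 0.3, eps = 0.05, delta = 0.1) are illustrative and not taken from the paper.

```python
import math

import numpy as np


def required_samples(eps, delta):
    """Smallest integer n with n >= (2 / eps) * log(1 / delta) (assumed form of the bound)."""
    return math.ceil(2.0 / eps * math.log(1.0 / delta))


def hit_rate(alpha, eps, delta, trials=2000, seed=0):
    """Estimate how often at least one of n uniform samples lands within eps of alpha."""
    rng = np.random.default_rng(seed)
    n = required_samples(eps, delta)
    # Each row is one independent experiment with n draws from U[-1, 1].
    samples = rng.uniform(-1.0, 1.0, size=(trials, n))
    hits = (np.abs(samples - alpha) <= eps).any(axis=1)
    return n, float(hits.mean())


n, rate = hit_rate(alpha=0.3, eps=0.05, delta=0.1)
print(f"n = {n}, empirical success rate = {rate:.3f}, target >= {1 - 0.1:.2f}")
```

Under these assumptions the reasoning is short: for eps <= 1, each sample falls within eps of alpha with probability at least eps / 2, so all n samples miss with probability at most (1 - eps / 2)^n <= exp(-n * eps / 2), which is at most delta once n reaches the bound.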
Pruning Randomly Initialized Neural Networks with Iterative Randomization
Daiki Chijiwa, Shin'ya Yamaguchi, Yasutoshi Ida, Kenji Umakoshi, Tomohiro Inoue
Pruning the weights of randomly initialized neural networks plays an important role in the context of the lottery ticket hypothesis. Ramanujan et al. (2020) empirically showed that pruning the weights alone, without optimizing their values, can achieve remarkable performance. However, to match the performance of weight optimization, the pruning approach requires more parameters in the network before pruning, and thus more memory. To overcome this parameter inefficiency, we introduce a novel framework that prunes randomly initialized neural networks while iteratively randomizing the weight values (IteRand). Theoretically, we prove an approximation theorem in our framework, which indicates that the randomizing operations provably reduce the required number of parameters. We also empirically demonstrate the parameter efficiency in multiple experiments on CIFAR-10 and ImageNet.
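To make the idea of pruning combined with iterative randomization concrete, here is a minimal sketch of one pruning-and-randomizing step on a single weight matrix. It is not the authors' implementation. It assumes that pruning keeps the highest-scoring fraction of weights, that only the currently pruned weights are re-sampled, and that fresh values come from U[-1, 1]. The helper names `topk_mask` and `rerandomize_pruned` are hypothetical, and random scores stand in for scores that would normally be learned.

```python
import numpy as np


def topk_mask(scores, keep_ratio):
    """Boolean mask that keeps the top `keep_ratio` fraction of entries by score."""
    k = max(1, int(round(keep_ratio * scores.size)))
    # Positions of the k largest scores in the flattened array.
    keep = np.argpartition(scores.ravel(), -k)[-k:]
    mask = np.zeros(scores.size, dtype=bool)
    mask[keep] = True
    return mask.reshape(scores.shape)


def rerandomize_pruned(weights, mask, rng):
    """Re-sample pruned weights from U[-1, 1]; kept weights are left untouched (assumption)."""
    fresh = rng.uniform(-1.0, 1.0, size=weights.shape)
    return np.where(mask, weights, fresh)


rng = np.random.default_rng(0)
weights = rng.uniform(-1.0, 1.0, size=(4, 8))  # randomly initialized, never trained
scores = rng.normal(size=weights.shape)        # stand-in for learned importance scores
mask = topk_mask(scores, keep_ratio=0.25)      # prune 75% of the weights
new_weights = rerandomize_pruned(weights, mask, rng)
effective = new_weights * mask                 # weights actually used in the forward pass
```

In a training loop, a step like this would alternate with score updates, so the pruned positions keep receiving new candidate values while the stored parameter count stays fixed.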