ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks - Supplementary Material

A Complementary Experiments
Neural Information Processing Systems
However, with deep networks, initialization can have an important effect on the final results. While designing an initialization strategy specifically for compact networks is an unexplored research direction, our ExpandNets can be initialized in a natural manner. Note that this strategy yields an additional accuracy boost to our approach.

The output of the last layer is passed through a fully-connected layer with 64 units, followed by a logit layer with either 10 or 100 units, matching the number of classes. We used standard stochastic gradient descent (SGD) with a momentum of 0.9 and a learning rate of 0.01, divided by 10 at epochs 50 and 100.
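To make the setup concrete, the sketch below expresses this training configuration in PyTorch. It is illustrative only: the backbone, its feature dimension (feat_dim), the intermediate nonlinearity, and the total number of epochs (num_epochs) are assumptions not specified in the excerpt, while the 64-unit fully-connected layer, the 10- or 100-unit logit layer, and the SGD schedule (momentum 0.9, learning rate 0.01, divided by 10 at epochs 50 and 100) follow the description above.

```python
import torch
import torch.nn as nn

# Placeholders: feature dimension of the (unspecified) backbone and the
# number of classes (10 or 100, matching the logit layer in the text).
feat_dim, num_classes = 128, 10

# Classifier head as described: a 64-unit fully-connected layer followed
# by a logit layer. The ReLU between them is an assumption.
head = nn.Sequential(
    nn.Linear(feat_dim, 64),
    nn.ReLU(inplace=True),
    nn.Linear(64, num_classes),
)

# SGD with momentum 0.9 and initial learning rate 0.01, divided by 10
# at epochs 50 and 100, as stated in the text.
optimizer = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[50, 100], gamma=0.1
)

num_epochs = 150  # assumption: the total epoch count is not given here
for epoch in range(num_epochs):
    # ... forward/backward passes and optimizer.step() would go here ...
    scheduler.step()  # lr: 0.01 -> 0.001 at epoch 50 -> 0.0001 at epoch 100
```

With this schedule, MultiStepLR multiplies the learning rate by gamma=0.1 at each milestone, reproducing the "divided by 10 at epochs 50 and 100" rule without manual intervention.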