VanillaNet: the Power of Minimalism in Deep Learning (Supplementary Material)
Neural Information Processing Systems
The detailed architecture for VanillaNet with 7-13 layers can be found in Table 1, where each convolutional layer is followed by an activation function. For VanillaNet-13-1.5, the number of channels is multiplied by 1.5. For classification on ImageNet, we train the VanillaNets for 300 epochs using cosine learning rate decay [5]. The λ is decayed linearly from 1 at epoch 0 to 0 at epoch 100. The training details can be found in Table 2.
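The two schedules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' training code: the function names `cosine_lr` and `linear_lambda`, the normalized base learning rate of 1.0, and the minimum learning rate of 0 are assumptions made here, and any warmup phase is omitted. Holding λ at 0 after epoch 100 is also an assumption, since the text only specifies the two endpoints. The actual hyperparameters are those listed in Table 2.

```python
import math


def cosine_lr(epoch, total_epochs=300, base_lr=1.0, min_lr=0.0):
    """Cosine decay from base_lr at epoch 0 to min_lr at total_epochs.

    base_lr and min_lr are placeholder values (assumptions), not the
    settings from Table 2.
    """
    # Clamp progress to [0, 1] so out-of-range epochs stay well defined.
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def linear_lambda(epoch, decay_epochs=100):
    """Linear decay of lambda from 1 at epoch 0 to 0 at decay_epochs.

    Assumption: lambda is held at 0 for the remaining epochs.
    """
    return max(0.0, 1.0 - epoch / decay_epochs)


if __name__ == "__main__":
    # Print both schedules at a few epochs of the 300-epoch run.
    for epoch in (0, 50, 100, 150, 300):
        print(epoch, round(cosine_lr(epoch), 4), round(linear_lambda(epoch), 4))
```

Both functions take a fractional epoch, so the same schedules can be evaluated per iteration by passing `epoch + step / steps_per_epoch`.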