VanillaNet: the Power of Minimalism in Deep Learning (Supplementary Material)

Neural Information Processing Systems 

The detailed architecture for VanillaNet with 7-13 layers can be found in Table 1, where each convolutional layer is followed by an activation function. For VanillaNet-13-1.5, the number of channels is multiplied by 1.5. For classification on ImageNet, we train the VanillaNets for 300 epochs using cosine learning rate decay [5]. The λ is linearly decayed from 1 at epoch 0 to 0 at epoch 100. The training details can be found in Table 2.
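The two schedules described above can be illustrated with a minimal sketch. It is not the authors' training code: the function names, the normalized base learning rate of 1.0, the minimum learning rate of 0, the absence of warmup, and holding λ at 0 after epoch 100 are all assumptions made for illustration. The actual hyperparameters are those listed in Table 2.

```python
import math


def lambda_schedule(epoch, decay_epochs=100):
    """Linearly decay lambda from 1 at epoch 0 to 0 at `decay_epochs`.

    Assumption: lambda is held at 0 for all later epochs.
    """
    return max(0.0, 1.0 - epoch / decay_epochs)


def cosine_lr(epoch, total_epochs=300, base_lr=1.0, min_lr=0.0):
    """Cosine learning rate decay over `total_epochs`.

    Assumptions: no warmup phase, and `base_lr` / `min_lr` are
    placeholder values rather than the values used in the paper.
    """
    progress = min(epoch, total_epochs) / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print both schedules at a few representative epochs.
    for epoch in (0, 50, 100, 150, 300):
        print(epoch, lambda_schedule(epoch), cosine_lr(epoch))
```

Under these assumptions, λ reaches 0 one third of the way through the 300 training epochs, while the learning rate continues to follow the cosine curve until the end of training.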
