Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets: Supplementary Materials
Neural Information Processing Systems
Code is modified from https://github.com/coeusguo/ceit.

All models are pre-trained on ImageNet-1K [1] only and then fine-tuned on the CIFAR-100 [2] dataset. Results are shown in Table 1; we cite the reported results from the corresponding papers. When fine-tuning our DHVT, we use the AdamW optimizer with a cosine learning rate scheduler and 2 warm-up epochs, a batch size of 256, an initial learning rate of 0.0005, a weight decay of 1e-8, and 100 fine-tuning epochs.
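As a concrete illustration of this recipe, below is a minimal PyTorch sketch of the optimizer and learning-rate schedule described above. The hyperparameter values are those reported in the text; the nn.Linear stand-in for the pre-trained model, the variable names, the linear warm-up shape, and the use of SequentialLR to compose warm-up with cosine decay are assumptions for illustration, not the paper's exact implementation.

    import torch
    from torch import nn
    from torch.optim import AdamW
    from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

    # Hyperparameters reported above.
    EPOCHS = 100         # total fine-tuning epochs
    WARMUP_EPOCHS = 2    # warm-up epochs before cosine decay
    BASE_LR = 5e-4       # initial learning rate (0.0005)
    WEIGHT_DECAY = 1e-8  # AdamW weight decay
    BATCH_SIZE = 256     # fine-tuning batch size

    # Hypothetical stand-in for a DHVT model pre-trained on ImageNet-1K.
    model = nn.Linear(384, 100)

    optimizer = AdamW(model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)

    # Linear warm-up for the first 2 epochs, then cosine decay over the rest.
    scheduler = SequentialLR(
        optimizer,
        schedulers=[
            LinearLR(optimizer, start_factor=1e-3, total_iters=WARMUP_EPOCHS),
            CosineAnnealingLR(optimizer, T_max=EPOCHS - WARMUP_EPOCHS),
        ],
        milestones=[WARMUP_EPOCHS],
    )

    for epoch in range(EPOCHS):
        # ... one pass over CIFAR-100 in batches of BATCH_SIZE ...
        scheduler.step()  # advance the learning-rate schedule once per epoch

Stepping the scheduler once per epoch matches the epoch-level warm-up and cosine decay described above; a per-iteration schedule would be an equally plausible reading.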