ff1418e8cc993fe8abcfe3ce2003e5c5-Supplemental.pdf

Neural Information Processing Systems 

The table (right) shows 100-epoch results using the best lr and wd values found in the 50-epoch runs. ViT's patchify stem differs from the proposed convolutional stem in the type of convolution used and ... We investigate these factors next. The focus of this paper is studying the large, positive impact of changing ViT's default ... We use AdamW for all experiments. Figure 7 shows the results.
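The difference between the two stem types can be illustrated with a small output-size calculation. This is only a sketch under stated assumptions: the excerpt says the stems differ in the type of convolution used but gives no layer specifications, so the 224-pixel input, the 16x16 stride-16 patchify convolution, and the stack of four 3x3 stride-2 convolutions followed by a 1x1 convolution are illustrative choices, and the names `conv_out`, `stem_out`, `PATCHIFY`, and `CONV_STEM` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length along one spatial axis for a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_out(size, layers):
    """Apply each (kernel, stride, padding) layer in turn to a spatial size."""
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
    return size


# Assumption: a patchify stem is one large non-overlapping convolution
# (16x16 kernel, stride 16, no padding).
PATCHIFY = [(16, 16, 0)]

# Assumption: a convolutional stem is a stack of small overlapping
# convolutions (four 3x3 stride-2 layers) followed by a 1x1 projection.
CONV_STEM = [(3, 2, 1)] * 4 + [(1, 1, 0)]

INPUT_SIDE = 224  # assumed input resolution

patchify_side = stem_out(INPUT_SIDE, PATCHIFY)
conv_side = stem_out(INPUT_SIDE, CONV_STEM)

print("patchify stem grid:", patchify_side, "x", patchify_side)
print("conv stem grid:    ", conv_side, "x", conv_side)
```

Under these assumptions both stems reduce the input to the same token grid, so the comparison isolates how the downsampling is done (one large-stride convolution versus several small overlapping ones) rather than how many tokens the transformer receives.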
