On the Surprising Effectiveness of Attention Transfer for Vision Transformers

Alexander C. Li

Neural Information Processing Systems 

Conventional wisdom suggests that pre-training Vision Transformers (ViT) improves downstream performance by learning useful representations.
