On the Surprising Effectiveness of Attention Transfer for Vision Transformers
Neural Information Processing Systems
Conventional wisdom suggests that pre-training Vision Transformers (ViT) improves downstream performance by learning useful representations.
Dec-27-2025, 08:08:48 GMT