On the Surprising Effectiveness of Attention Transfer for Vision Transformers

Mar-22-2026, 13:56:22 GMT–Neural Information Processing Systems

Conventional wisdom suggests that pre-training Vision Transformers (ViT) improves downstream performance by learning useful representations.

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.50)