Vision Transformer Neural Architecture Search for Out-of-Distribution Generalization: Benchmark and Insights

May-27-2025, 10:02:56 GMT–Neural Information Processing Systems

While Vision Transformers (ViTs) have achieved success across various machine learning tasks, deploying them in real-world scenarios faces a critical challenge: generalizing under Out-of-Distribution (OoD) shifts. A crucial research gap remains in understanding how to design ViT architectures, both manually and automatically, to excel in OoD generalization. To address this gap, we introduce OoD-ViT-NAS, the first systematic benchmark for ViT Neural Architecture Search (NAS) focused on OoD generalization. The benchmark includes 3,000 ViT architectures spanning a range of computational budgets, evaluated on common large-scale OoD datasets. With this benchmark at hand, we analyze the factors that contribute to the OoD generalization of ViT architectures. First, we show that ViT architecture design has a considerable impact on OoD generalization.

accuracy, generalization, ood generalization

Technology:
- Information Technology > Artificial Intelligence
  - Systems & Languages > Problem-Independent Architectures (0.63)
  - Machine Learning > Neural Networks (0.63)