Vision Transformer Neural Architecture Search for Out-of-Distribution Generalization: Benchmark and Insights

Neural Information Processing Systems 

While Vision Transformers (ViTs) have achieved success across various machine learning tasks, deploying them in real-world scenarios faces a critical challenge: generalizing under Out-of-Distribution (OoD) shifts. A crucial research gap remains in understanding how to design ViT architectures, both manually and automatically, to excel at OoD generalization.