Linearly Decomposing and Recomposing Vision Transformers for Diverse-Scale Models

Shuxia Lin

Neural Information Processing Systems 

Vision Transformers (ViTs) are widely used across many applications, but they usually have a fixed architecture that may not match the varying computational resources of different deployment environments.