ToFe: Lagged Token Freezing and Reusing for Efficient Vision Transformer Inference
Haoyue Zhang, Jie Zhang, Song Guo
arXiv.org Artificial Intelligence
Although vision transformers (ViTs) have shown remarkable success in various vision tasks, their computationally expensive self-attention hinders deployment on resource-constrained devices. Token reduction, which discards less important tokens during forward propagation, has been proposed to improve the efficiency of transformer models. However, existing methods discard unimportant tokens irreversibly, preventing their reuse in subsequent blocks. Since transformers attend to different information in different blocks, tokens reduced in early blocks might be useful later. Furthermore, adapting transformer models to resource-constrained devices requires striking a balance between model performance and computational overhead. To address these challenges, this paper introduces a novel Token Freezing and Reusing (ToFe) framework, which identifies important tokens at each stage and temporarily freezes the unimportant ones, allowing their lagged reuse at a later stage. Specifically, we design a prediction module for token identification and an approximate module for recovery of the frozen tokens. By jointly optimizing these modules with the backbone through computation-budget-aware end-to-end training, ToFe adaptively processes the necessary tokens at each block, reducing computational cost while maintaining performance. Extensive experiments demonstrate that ToFe reduces the computational cost of the LV-ViT model by 50% with less than a 2% drop in Top-1 accuracy, achieving a better trade-off between performance and complexity than state-of-the-art methods.

Large-scale pre-trained vision transformer (ViT) models [37] have achieved remarkable progress in the field of vision tasks.
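The freeze-and-reuse idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear scoring function (`predict_keep_mask`) standing in for the prediction module, the linear map (`approximate_frozen`) standing in for the approximate recovery module, and all shapes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_keep_mask(tokens, w_score, keep_ratio):
    """Hypothetical prediction module: score each token with a linear head
    and keep the top fraction; the rest are frozen, not discarded."""
    scores = tokens @ w_score                      # (N,) importance per token
    k = max(1, int(len(tokens) * keep_ratio))
    keep = np.zeros(len(tokens), dtype=bool)
    keep[np.argsort(scores)[-k:]] = True
    return keep

def approximate_frozen(frozen_tokens, w_approx):
    """Hypothetical approximate module: a cheap linear update standing in
    for the skipped blocks, so frozen tokens can be reused later."""
    return frozen_tokens @ w_approx

# Toy setup: 8 tokens with 4-dim embeddings.
N, D = 8, 4
x = rng.standard_normal((N, D))
w_score = rng.standard_normal(D)
w_approx = np.eye(D) + 0.01 * rng.standard_normal((D, D))

keep = predict_keep_mask(x, w_score, keep_ratio=0.5)
active, frozen = x[keep], x[~keep]

# Only the active tokens would pass through the expensive transformer
# blocks here (placeholder: identity).
active = active * 1.0

# Lagged reuse: recover frozen tokens via the approximate module and merge
# them back, restoring the full token set for a later stage.
recovered = approximate_frozen(frozen, w_approx)
merged = np.empty_like(x)
merged[keep], merged[~keep] = active, recovered
print(merged.shape)
```

The key contrast with irreversible token pruning is the merge step: frozen tokens re-enter the sequence after a cheap approximate update, so later blocks can still attend to them.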
Jul-23-2025