HydraViT: Stacking Heads for a Scalable ViT

Neural Information Processing Systems 

The architecture of Vision Transformers (ViTs), particularly the Multi-head Attention (MHA) mechanism, imposes substantial hardware demands.
