IA-RED²: Interpretability-Aware Redundancy Reduction for Vision Transformers (Supplementary Material)
Bowen Pan

Neural Information Processing Systems 

Algorithm 1 Optimize multi-head interpreters and MSA-FFN blocks on DeiT-S.
Require: A token sequence X right after the positional embedding and its label Y.