Exploiting Information Redundancy in Attention Maps for Extreme Quantization of Vision Transformers
