Dissecting Query-Key Interaction in Vision Transformers