Twins: Revisiting the Design of Spatial Attention in Vision Transformers