ECViT: Efficient Convolutional Vision Transformer with Local-Attention and Multi-scale Stages
