The Power of Convolution in Vision Transformers