Revisiting the Integration of Convolution and Attention for Vision Backbone