What Makes for Hierarchical Vision Transformer?