ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation