Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers