Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE