Improving BERT with Hybrid Pooling Network and Drop Mask