What Makes Pre-trained Language Models Better Zero-shot Learners?