What Makes Pre-trained Language Models Better Zero-shot Learners?
