Language Models are Few-Shot Learners
Neural Information Processing Systems
Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting.
Oct-2-2025, 04:31:00 GMT