kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Weijia Shi, Julian Michael, Suchin Gururangan, Luke Zettlemoyer
arXiv.org Artificial Intelligence
Retrieval-augmented language models (LMs) use non-parametric memory to substantially outperform their non-retrieval counterparts on perplexity-based evaluations, but it is an open question whether they achieve similar gains in few- and zero-shot end-task accuracy. We extensively study one such model, the k-nearest neighbor LM (kNN-LM), and show that the gains transfer only marginally. The main challenge is to achieve coverage of the verbalizer tokens that define the different end-task class labels. To address this challenge, we introduce kNN-Prompt, a simple and effective kNN-LM with automatically expanded fuzzy verbalizers (e.g., expanding "terrible" to also include "silly" and other task-specific synonyms for sentiment classification). Across nine diverse end tasks, using kNN-Prompt with GPT-2 large yields significant performance boosts over strong zero-shot baselines (13.4% absolute improvement over the base LM on average). We also show that other advantages of non-parametric augmentation hold for end tasks: kNN-Prompt is effective for domain adaptation with no further training, and gains increase with the size of the retrieval model.
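The idea in the abstract can be illustrated with a small toy sketch. It assumes the standard kNN-LM interpolation, p = lam * p_kNN + (1 - lam) * p_LM, and assumes that a fuzzy verbalizer scores a label by summing probability over its expanded token set; the abstract does not spell out either detail, so both are assumptions here. All names, the five-token vocabulary, and the numbers are hypothetical and chosen only for illustration.

```python
import math

def knn_distribution(distances, next_tokens, vocab_size, temperature=1.0):
    """Turn retrieved neighbors into a distribution over next tokens.

    Neighbor i contributes weight exp(-distances[i] / temperature) to the
    token that followed it in the datastore (assumed kNN-LM-style weighting).
    """
    weights = [math.exp(-d / temperature) for d in distances]
    total = sum(weights)
    probs = [0.0] * vocab_size
    for w, tok in zip(weights, next_tokens):
        probs[tok] += w / total
    return probs

def interpolate(p_lm, p_knn, lam):
    """Mix the base LM and kNN distributions with weight lam on the kNN side."""
    return [lam * k + (1.0 - lam) * l for l, k in zip(p_lm, p_knn)]

def score_labels(p, verbalizers):
    """Sum probability mass over each label's set of verbalizer token ids."""
    return {label: sum(p[t] for t in toks) for label, toks in verbalizers.items()}

# Hypothetical toy vocabulary: 0=great, 1=good, 2=terrible, 3=silly, 4=the
p_lm = [0.30, 0.10, 0.25, 0.05, 0.30]

# Four retrieved neighbors at equal distance; two were followed by "silly".
p_knn = knn_distribution([1.0, 1.0, 1.0, 1.0], [3, 3, 2, 0], vocab_size=5)
p = interpolate(p_lm, p_knn, lam=0.5)

seed = {"positive": [0], "negative": [2]}          # one token per label
fuzzy = {"positive": [0, 1], "negative": [2, 3]}   # expanded token sets

seed_scores = score_labels(p, seed)
scores = score_labels(p, fuzzy)
seed_prediction = max(seed_scores, key=seed_scores.get)
prediction = max(scores, key=scores.get)
print(seed_prediction, prediction)
```

In this toy case the retrieved mass lands mostly on "silly", which the single-token verbalizer ignores, so the seed verbalizer picks "positive" while the expanded one picks "negative". This is meant only to show why verbalizer coverage matters, not to reproduce the paper's implementation.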
Nov-1-2022
- Country:
  - North America
    - Dominican Republic (0.04)
    - United States
      - Washington > King County > Seattle (0.14)
      - New York > New York County > New York City (0.04)
      - Minnesota > Hennepin County > Minneapolis (0.04)
      - Michigan > Washtenaw County > Ann Arbor (0.04)
      - California > Santa Clara County > Palo Alto (0.04)
  - Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
  - Asia
- Genre:
  - Research Report (0.82)
- Industry:
  - Leisure & Entertainment (1.00)
  - Media (0.68)
- Technology: