An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels
Taylor Sorensen, Joshua Robinson, Christopher Michael Rytting, Alexander Glenn Shaw, Kyle Jeffrey Rogers, Alexia Pauline Delorey, Mahmoud Khalil, Nancy Fulda, David Wingate
arXiv.org Artificial Intelligence
Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting …

Figure 1: Performance of the template selected by our maximum mutual information method (MI) compared to the worst, mean, median, and best prompt on GPT-3 Davinci (175B). Our method performs at almost oracle levels, without labels or access to model weights.
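The selection criterion in the abstract can be sketched in a few lines: estimate I(X; Y) for each candidate template as H(Y) minus H(Y | X), where H(Y) is the entropy of the marginal (averaged) output distribution and H(Y | X) averages the per-input output entropies, then keep the template with the highest score. The example below is a minimal sketch, not the authors' implementation; the template strings and probability vectors are hypothetical stand-ins for per-example label distributions one might obtain from a model API's log-probabilities.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mutual_information(output_dists):
    """Estimate I(X; Y) from per-input output distributions P(Y | x_i).

    I(X; Y) = H(Y) - H(Y | X): the entropy of the marginal (mean)
    output distribution minus the average per-input entropy.
    """
    n = len(output_dists)
    k = len(output_dists[0])
    marginal = [sum(d[j] for d in output_dists) / n for j in range(k)]
    h_marginal = entropy(marginal)
    h_conditional = sum(entropy(d) for d in output_dists) / n
    return h_marginal - h_conditional

def select_template(dists_by_template):
    """Pick the candidate template whose outputs maximize mutual information."""
    return max(dists_by_template,
               key=lambda t: mutual_information(dists_by_template[t]))

# Hypothetical example: two candidate templates scored on three unlabeled
# inputs, each with a binary label distribution from the model.
candidates = {
    "Q: {x} A:": [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]],   # confident, varied answers
    "{x} =>":    [[0.5, 0.5], [0.55, 0.45], [0.5, 0.5]],  # near-uniform, uninformative
}
best = select_template(candidates)
```

A template whose outputs are confident on each input yet varied across inputs gets low conditional entropy and high marginal entropy, and therefore high mutual information; a template that leaves the model near-uniform everywhere scores close to zero. Note that no labels are needed at any point.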
Mar-21-2022