Meta-tuning Language Models to Answer Prompts Better

Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein

Apr-16-2021–arXiv.org Artificial Intelligence

Large pretrained language models like GPT-3 have acquired a surprising ability to perform zero-shot classification (ZSC). For example, to classify review sentiments, we can "prompt" the language model with the review and the question "Is the review positive?" as the context, and ask it to predict whether the next word is "Yes" or "No". However, these models are not specialized for answering these prompts. To address this weakness, we propose meta-tuning, which trains the model to specialize in answering prompts but still generalize to unseen tasks. To create the training data, we aggregated 43 existing datasets, annotated 441 label descriptions in total, and unified them into the above question answering (QA) format. After meta-tuning, our model outperforms a same-sized QA model for most labels on unseen tasks, and we forecast that the performance would improve for even larger models. Therefore, measuring ZSC performance on non-specialized language models might underestimate their true capability, and community-wide efforts on aggregating datasets and unifying their formats can help build models that understand prompts better.
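The unification step described above can be illustrated with a minimal sketch. It assumes a single-label dataset in which each label has one or more annotated descriptions phrased as yes/no questions. The names `QAExample`, `to_qa_examples`, and `build_prompt`, the "Is the review negative?" description, and the `Answer:` prompt template are all hypothetical and chosen for illustration; the abstract does not specify the authors' exact format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAExample:
    context: str   # the text to classify
    question: str  # a label description phrased as a yes/no question
    answer: str    # "Yes" or "No"


def to_qa_examples(text, gold_label, label_descriptions):
    """Turn one labeled example into one Yes/No example per label description.

    Assumption: the dataset is single-label, so every description belonging
    to a label other than the gold label is answered "No".
    """
    examples = []
    for label, questions in label_descriptions.items():
        answer = "Yes" if label == gold_label else "No"
        for question in questions:
            examples.append(QAExample(text, question, answer))
    return examples


def build_prompt(example):
    """Join context and question; a model would then predict the next word.

    The layout and the trailing "Answer:" are an assumed template.
    """
    return f"{example.context}\n{example.question}\nAnswer:"


# Hypothetical label descriptions for a sentiment dataset.
SENTIMENT_DESCRIPTIONS = {
    "positive": ["Is the review positive?"],
    "negative": ["Is the review negative?"],
}

examples = to_qa_examples(
    "A wonderful, moving film.", "positive", SENTIMENT_DESCRIPTIONS
)
for ex in examples:
    print(repr(build_prompt(ex)), "->", ex.answer)
```

Because every dataset is reduced to the same (context, question, Yes/No) triple, examples from different tasks can be mixed in one training set, and an unseen task only needs new label descriptions at test time.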

Country:
- Oceania > Australia (0.04)
- North America
  - Mexico (0.04)
  - United States
    - Oregon (0.04)
    - Nevada (0.04)
    - Alabama (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California
      - San Diego County > San Diego (0.04)
      - Alameda County > Berkeley (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Russia (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Italy > Tuscany
    - Florence (0.04)
- Asia
  - Russia (0.04)
  - Middle East > Qatar (0.04)
  - China > Hong Kong (0.04)
- Africa
  - Zimbabwe (0.04)
  - Namibia (0.04)
  - Madagascar (0.04)
  - Kenya (0.04)
  - Eswatini (0.04)
  - Botswana (0.04)

Genre:
- Research Report (0.82)

Industry:
- Leisure & Entertainment (1.00)
- Media > Film (0.94)
- Health & Medicine > Therapeutic Area (0.93)
- Law (0.67)
- Government > Regional Government
  - North America Government > United States Government (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Text Classification (0.94)
    - Large Language Model (0.87)
  - Machine Learning > Neural Networks
    - Deep Learning (0.34)
