Elaboration-Generating Commonsense Question Answering at Scale

Wang, Wenya, Srikumar, Vivek, Hajishirzi, Hanna, Smith, Noah A.

Jul-14-2023, arXiv.org Artificial Intelligence

In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other. Using less than 0.5% of the parameters of GPT-3, our model outperforms alternatives with similar sizes and closes the gap on GPT-3 on four commonsense question answering benchmarks. Human evaluations show that the quality of the generated elaborations is high.

machine learning, natural language, question answering

Country:
- Europe (0.28)
- North America > United States (0.14)
- Asia > China (0.14)

Genre:
- Research Report (0.50)

Industry:
- Health & Medicine > Therapeutic Area (0.68)
- Energy > Oil & Gas
  - Upstream (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.94)
    - Question Answering (0.92)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
