Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering
Wang Zhu, Jesse Thomason, Robin Jia
arXiv.org Artificial Intelligence
We train a language model (LM) to robustly answer multistep questions by generating and answering sub-questions. We propose Chain-of-Questions, a framework that trains a model to generate sub-questions and sub-answers one at a time by leveraging human-annotated question decomposition meaning representations (QDMR). The key technical challenge is that QDMR only contains sub-questions but not answers to those sub-questions, so we treat sub-answers as latent variables and optimize them using a novel dynamic mixture of Hard-EM and MAPO. Chain-of-Questions greatly outperforms strong neuro-symbolic methods by 9.0 F1 on the DROP contrast set, and outperforms GPT-3.5 by 24.3 F1 on the HotpotQA adversarial set, demonstrating the effectiveness and robustness of our framework.
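The Hard-EM component mentioned in the abstract can be illustrated with a small sketch: when gold sub-answers are unobserved, candidate sub-answers are treated as latent variables, and the "hard" E-step picks the single candidate that best explains the gold final answer, which then serves as the training target in the M-step. This is a minimal, hypothetical toy (the function names and scorer are assumptions, not the paper's implementation), and it omits the MAPO term of the dynamic mixture.

```python
# Toy Hard-EM selection over latent sub-answers (illustrative sketch only;
# not the paper's actual training code).

def hard_em_select(candidates, score_fn):
    """Hard E-step: pick the single highest-scoring latent sub-answer."""
    return max(candidates, key=score_fn)

# Hypothetical scorer: reward candidates that lead to the gold final answer.
gold_final = "42"
score = lambda cand: 1.0 if cand == gold_final else 0.0

best = hard_em_select(["40", "42", "seven"], score)
# The M-step would then train the LM on `best` as a pseudo-label.
```

In the paper's framework, the score would come from the downstream likelihood of the correct final answer given the sub-answer, and the Hard-EM objective is dynamically mixed with a MAPO-style policy-optimization term.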
Dec-23-2023