QAConv: Question Answering on Informative Conversations

Wu, Chien-Sheng, Madotto, Andrea, Liu, Wenhao, Fung, Pascale, Xiong, Caiming

May-14-2021 – arXiv.org Artificial Intelligence

This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source. We focus on informative conversations, including business emails, panel discussions, and work channels. Unlike open-domain and task-oriented dialogues, these conversations are usually long, complex, and asynchronous, and they involve strong domain knowledge. In total, we collect 34,204 QA pairs, including span-based, free-form, and unanswerable questions, from 10,259 selected conversations, with both human-written and machine-generated questions. We segment long conversations into chunks and use a question generator and a dialogue summarizer as auxiliary tools to collect multi-hop questions. The dataset has two testing scenarios, chunk mode and full mode, depending on whether the grounded chunk is provided or must be retrieved from a large conversational pool. Experimental results show that state-of-the-art QA systems trained on existing QA datasets have limited zero-shot ability and tend to predict our questions as unanswerable. Fine-tuning such systems on our corpus yields significant improvements of up to 23.6% in chunk mode and 13.6% in full mode.
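
The difference between the two testing scenarios can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `segment`, `overlap_score`, and `retrieve`, the fixed chunk size, the toy conversation, and the word-overlap retriever are hypothetical and are not taken from the paper, which does not specify its segmentation rule or retriever in this abstract.

```python
from collections import Counter


def segment(turns, max_turns=4):
    """Split a conversation (list of turn strings) into consecutive chunks.

    Fixed-size chunking is an assumption; the paper's rule may differ.
    """
    return [turns[i:i + max_turns] for i in range(0, len(turns), max_turns)]


def overlap_score(question, chunk):
    """Count word tokens shared by the question and a chunk (toy retriever)."""
    q = Counter(question.lower().split())
    c = Counter(" ".join(chunk).lower().split())
    return sum((q & c).values())


def retrieve(question, pool):
    """Full mode: choose the highest-scoring chunk from the whole pool."""
    return max(pool, key=lambda chunk: overlap_score(question, chunk))


# Hypothetical toy conversation; not taken from the dataset.
conversation = [
    "alice: the quarterly report is due on friday",
    "bob: i will send the draft tomorrow",
    "carol: the budget meeting moved to room twelve",
    "dave: please bring the signed contract",
]
pool = segment(conversation, max_turns=2)
question = "where is the budget meeting"

# Chunk mode: the grounded chunk is handed to the QA model directly.
chunk_mode_input = pool[1]
# Full mode: the chunk must first be retrieved from the pool.
full_mode_input = retrieve(question, pool)
```

In chunk mode a QA model only has to read the given chunk, whereas in full mode a retrieval error propagates to the answer, which is consistent with the smaller improvement reported for full mode.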
