Multilingual Non-Factoid Question Answering with Silver Answers
Mishra, Ritwik, Vennam, Sreeram, Shah, Rajiv Ratn, Kumaraguru, Ponnurangam
arXiv.org Artificial Intelligence
Most existing Question Answering Datasets (QuADs) focus on factoid-based, short-context Question Answering (QA) in high-resource languages. Such datasets remain scarce for low-resource languages: only a few works address factoid-based QuADs, and none address non-factoid QuADs. This work therefore presents MuNfQuAD, a multilingual QuAD with non-factoid questions. It uses interrogative sub-headings from BBC news articles as questions and the corresponding paragraphs as silver answers. The dataset comprises over 370K QA pairs across 38 languages, including several low-resource languages, and stands as the largest multilingual QA dataset to date. Based on manual annotations of 790 QA pairs from MuNfQuAD (the golden set), we observe that 98% of questions can be answered using their corresponding silver answer. Our fine-tuned Answer Paragraph Selection (APS) model outperforms the baselines, attaining an accuracy of 80% and 72%, and a macro F1 of 72% and 66%, on the MuNfQuAD test set and the golden set, respectively. Furthermore, the APS model generalizes effectively to a certain language within the golden set, even after being fine-tuned on silver labels.
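The construction described in the abstract (interrogative sub-headings as questions, the paragraphs under them as silver answers) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `QAPair` class, the `extract_qa_pairs` function, the `(kind, text)` input format, and the "heading ends with a question mark" test are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: a sub-heading counts as interrogative if it ends with one of
# these question marks (ASCII, Arabic, full-width). The real criterion used
# for MuNfQuAD may differ.
QUESTION_MARKS = ("?", "؟", "？")


@dataclass
class QAPair:
    question: str             # the interrogative sub-heading
    silver_answer: list[str]  # paragraphs that follow it


def extract_qa_pairs(blocks):
    """Pair each interrogative sub-heading with the paragraphs below it.

    `blocks` is a hypothetical pre-parsed article: a list of (kind, text)
    tuples in document order, where kind is "heading" or "paragraph".
    """
    pairs = []
    current = None
    for kind, text in blocks:
        text = text.strip()
        if kind == "heading":
            # A new sub-heading closes the previous question, if any.
            if current is not None and current.silver_answer:
                pairs.append(current)
            # Only interrogative sub-headings open a new question.
            current = QAPair(text, []) if text.endswith(QUESTION_MARKS) else None
        elif kind == "paragraph" and current is not None and text:
            current.silver_answer.append(text)
    # Flush the last open question at the end of the article.
    if current is not None and current.silver_answer:
        pairs.append(current)
    return pairs
```

Paragraphs that appear before the first interrogative sub-heading, or under a non-interrogative one, are skipped in this sketch; they would plausibly serve as non-answer paragraphs for an answer paragraph selection task, though that detail is also an assumption here.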
Aug-20-2024
- Country:
  - South America (0.04)
  - Africa > Middle East (0.04)
  - North America
    - Central America (0.04)
    - United States
      - New York > New York County
        - New York City (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
  - Europe
    - United Kingdom > Scotland (0.04)
    - Ukraine (0.04)
    - Russia (0.04)
    - Middle East (0.04)
    - Greece > Ionian Islands
      - Corfu (0.04)
    - France > Occitanie
      - Haute-Garonne > Toulouse (0.04)
  - Asia
    - India (0.05)
    - Nepal (0.04)
    - China (0.04)
    - Russia (0.04)
    - Middle East > Qatar (0.04)
    - Bangladesh (0.04)
    - Japan > Honshū
      - Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Government > Military (0.46)
- Technology: