Cross-Lingual Open-Domain Question Answering with Answer Sentence Generation
Benjamin Muller, Luca Soldaini, Rik Koncel-Kedziorski, Eric Lind, Alessandro Moschitti
arXiv.org Artificial Intelligence
Open-Domain Generative Question Answering has achieved impressive performance in English by combining document-level retrieval with answer generation. These approaches, which we refer to as GenQA, can generate complete sentences, effectively answering both factoid and non-factoid questions. In this paper, we extend GenQA to the multilingual and cross-lingual settings. For this purpose, we first introduce GenTyDiQA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian. Based on GenTyDiQA, we design a cross-lingual generative model that produces full-sentence answers by exploiting passages written in multiple languages, including languages different from that of the question. Our cross-lingual generative system outperforms answer sentence selection baselines for all five languages, and outperforms monolingual generative pipelines for three of the five languages studied.
Dec-19-2022
- Country:
- Europe (1.00)
- North America > United States
- Minnesota (0.28)
- Genre:
- Research Report (0.82)
- Technology: