Cross-Lingual Open-Domain Question Answering with Answer Sentence Generation
Benjamin Muller, Luca Soldaini, Rik Koncel-Kedziorski, Eric Lind, Alessandro Moschitti
arXiv.org Artificial Intelligence
Open-Domain Generative Question Answering has achieved impressive performance in English by combining document-level retrieval with answer generation. These approaches, which we refer to as GenQA, can generate complete sentences, effectively answering both factoid and non-factoid questions. In this paper, we extend GenQA to the multilingual and cross-lingual settings. To this end, we first introduce GenTyDiQA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian. Based on GenTyDiQA, we design a cross-lingual generative model that produces full-sentence answers by exploiting passages written in multiple languages, including languages other than that of the question. Our cross-lingual generative system outperforms answer sentence selection baselines in all five languages and monolingual generative pipelines in three of the five languages studied.
Dec-19-2022
- Genre:
- Research Report (0.82)
- Technology: