PQuAD: A Persian Question Answering Dataset
Kasra Darvishi, Newsha Shahbodagh, Zahra Abbasiantaeb, Saeedeh Momtazi
arXiv.org Artificial Intelligence
PQuAD includes 80,000 questions along with their answers, and 25% of the questions are adversarially unanswerable. We examine various properties of the dataset to show its diversity and its level of difficulty as an MRC benchmark. By releasing this dataset, we aim to ease research on Persian reading comprehension and the development of Persian question answering systems. Our experiments with several state-of-the-art pre-trained contextualized language models yield 74.8% Exact Match (EM) and 87.6% F1-score, which can serve as baseline results for further research on Persian QA.
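For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Exact Match and token-level F1 are commonly computed for extractive QA in the SQuAD style. It is not the authors' evaluation script. The function names and the whitespace-only normalization are assumptions made for illustration, and an unanswerable question is assumed to be represented by an empty string.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumption: lowercase and split on whitespace only. A real Persian
    # evaluation would likely also normalize punctuation and character variants.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 when the normalized token sequences are identical, else 0.0.
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Unanswerable case (empty string): credit only if both are empty.
        return float(pred == ref)
    # Multiset intersection counts tokens shared by prediction and reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level EM and F1 are then typically the averages of these per-question scores, usually taking the maximum over the available reference answers for each question.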
Feb-13-2022
- Country:
  - Asia > Middle East
    - Iran > Tehran Province > Tehran (0.04)
    - Caspian Sea (0.04)
  - Europe > France
    - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
  - North America
    - Canada > British Columbia
    - United States
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Texas > Travis County
        - Austin (0.04)
      - Washington > King County
        - Seattle (0.04)
- Genre:
  - Research Report (0.50)
- Industry:
  - Education (0.37)
  - Leisure & Entertainment (0.46)
  - Media (0.46)
- Technology: