PQuAD: A Persian Question Answering Dataset

Darvishi, Kasra, Shahbodagh, Newsha, Abbasiantaeb, Zahra, Momtazi, Saeedeh

Feb-13-2022–arXiv.org Artificial Intelligence

PQuAD includes 80,000 questions along with their answers, with 25% of the questions being adversarially unanswerable. We examine various properties of the dataset to show its diversity and its level of difficulty as a machine reading comprehension (MRC) benchmark. By releasing this dataset, we aim to ease research on Persian reading comprehension and the development of Persian question answering systems. Our experiments on different state-of-the-art pre-trained contextualized language models show 74.8% Exact Match (EM) and 87.6% F1-score, which can be used as baseline results for further research on Persian QA.
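
The EM and F1 figures quoted above are span-level QA metrics. The sketch below shows how such scores are commonly computed in SQuAD-style evaluation; it is an illustration under assumptions, not the authors' evaluation script. In particular, the `normalize` helper (lowercasing and whitespace splitting) is a hypothetical stand-in, since the abstract does not say which Persian-specific normalization was applied, and treating an empty string as the gold answer for unanswerable questions is likewise an assumption.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumed normalization: lowercase and split on whitespace.
    # A real Persian evaluation would likely also unify character
    # variants and strip punctuation; that is not specified in the source.
    return text.lower().split()


def exact_match(prediction: str, gold: str) -> float:
    # 1.0 when the normalized token sequences are identical, else 0.0.
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Unanswerable case (assumed to be encoded as an empty answer):
        # correct only if both prediction and gold are empty.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level EM and F1 are then typically the averages of these per-question scores, taking the maximum over gold answers when a question has more than one reference.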

machine learning, natural language, question answering

Country:
- Caspian Sea (0.04)
- North America
  - United States
    - Washington > King County
      - Seattle (0.04)
    - Texas > Travis County
      - Austin (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe > France
  - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Asia > Middle East
  - Iran > Tehran Province > Tehran (0.04)

Genre:
- Research Report (0.50)

Industry:
- Media (0.46)
- Leisure & Entertainment (0.46)
- Education (0.37)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Question Answering (0.94)
  - Machine Learning > Neural Networks (0.69)
