Evaluating the Robustness of Machine Reading Comprehension Models to Low Resource Entity Renaming
Siro, Clemencia, Ajayi, Tunde Oluwaseyi
arXiv.org Artificial Intelligence
Question answering (QA) models have shown compelling results on the task of Machine Reading Comprehension (MRC). Recently, these systems have been shown to outperform humans on the held-out test sets of datasets such as SQuAD, but their robustness is not guaranteed. The brittleness of QA models is exposed by a drop in performance when they are evaluated on adversarially generated examples. In this study, we explore the robustness of MRC models to entity renaming, with entities from low-resource regions such as Africa. We propose EntSwap, a method for test-time perturbations, to create a test set whose entities have been renamed. In particular, we rename entities of type country, person, nationality, location, organization, and city to create AfriSQuAD2. Using the perturbed test set, we evaluate the robustness of three popular MRC models. We find that, compared to base models, large models perform comparatively well on novel entities. Furthermore, our analysis indicates that the person entity type is particularly challenging for MRC models.
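The abstract does not give the details of EntSwap, so the following is only a minimal sketch of what a test-time entity-renaming perturbation could look like, not the authors' implementation. It assumes that the entity mentions have already been identified and that a replacement name has been chosen for each one. The function name `swap_entities`, the example fields, and the replacement names are hypothetical and serve purely as illustration.

```python
import re


def swap_entities(example, replacements):
    """Rename entities consistently across a SQuAD-style example.

    `example` is a dict with "context", "question" and "answers" (a list of
    answer strings). `replacements` maps an original entity mention to its
    new name. Both structures are assumptions made for this sketch.
    """
    if not replacements:
        return dict(example)

    # Try longer mentions first so that "New York City" wins over "New York".
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def rename(text):
        return pattern.sub(lambda m: replacements[m.group(1)], text)

    context = rename(example["context"])
    answers = [rename(a) for a in example["answers"]]
    return {
        "context": context,
        "question": rename(example["question"]),
        "answers": answers,
        # Character offsets shift after renaming, so they are recomputed.
        # str.find returns the first occurrence, which is a simplification.
        "answer_starts": [context.find(a) for a in answers],
    }


# Illustrative input; the replacement names are arbitrary examples.
example = {
    "context": "John Smith moved from Chicago to Seattle in 1999.",
    "question": "Where did John Smith move to?",
    "answers": ["Seattle"],
}
replacements = {
    "John Smith": "Wanjiru Kamau",
    "Chicago": "Nairobi",
    "Seattle": "Mombasa",
}
perturbed = swap_entities(example, replacements)
```

Applying the same mapping to the context, the question, and the answers keeps each example answerable after renaming, so that any drop in model performance can be attributed to the unfamiliar entities rather than to an inconsistent example.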
Apr-6-2023
- Country:
  - Africa
    - Kenya (0.04)
    - Middle East > Egypt (0.04)
  - Asia
    - China > Jiangsu Province
      - Nanjing (0.04)
    - India (0.04)
    - Japan (0.04)
    - Middle East > Israel (0.04)
    - Russia (0.04)
  - Europe
    - Poland > Masovia Province
      - Warsaw (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
    - United Kingdom (0.04)
    - Ireland > Connaught
      - County Galway > Galway (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Greece (0.04)
    - Russia (0.04)
    - France (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Germany (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Austria (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.05)
  - North America
    - Canada > British Columbia
    - United States
      - New York > New York County
        - New York City (0.04)
      - California
        - Los Angeles County
          - Long Beach (0.04)
          - Los Angeles (0.04)
        - San Diego County > San Diego (0.04)
      - Washington > King County
        - Seattle (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - Oklahoma > Oklahoma County
        - Oklahoma City (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Nevada (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Texas > Travis County
        - Austin (0.04)
  - Oceania > Australia
- Genre:
  - Research Report > New Finding (0.34)
- Industry:
  - Education > Assessment & Standards > Student Performance (0.61)
- Technology: