The Impacts of Unanswerable Questions on the Robustness of Machine Reading Comprehension Models

Tran, Son Quoc, Do, Phong Nguyen-Thuan, Le, Uyen, Kretchmar, Matt

arXiv.org Artificial Intelligence 

Pretrained language models have achieved super-human performance on many Machine Reading Comprehension (MRC) benchmarks. Nevertheless, their relative inability to defend against adversarial attacks has spurred skepticism about their natural language understanding. In this paper, we ask whether training with unanswerable questions in SQuAD 2.0 can help improve the robustness of MRC models against adversarial attacks. To explore that question, we fine-tune three state-of-the-art language models on either SQuAD 1.1 or SQuAD 2.0 and then evaluate their robustness under adversarial attacks. Our experiments reveal that current models fine-tuned on SQuAD 2.0 do not initially appear to be any more robust than ones fine-tuned on SQuAD 1.1, yet they reveal a measure of hidden robustness that can be leveraged to realize actual performance gains. Furthermore, we find that the robustness of models fine-tuned on SQuAD 2.0 extends to additional out-of-domain datasets. Finally, we introduce a new adversarial attack ...

Figure 1: Example of predictions to an answerable question of RoBERTa fine-tuned on SQuAD 1.1 (Rajpurkar et al., 2016) (v1) versus its counterpart fine-tuned on SQuAD 2.0 (Rajpurkar et al., 2018) (v2) under adversarial attack. While RoBERTa v1 predicts "DartFord" as the answer under attack, RoBERTa v2 knows that "DartFord" is not the correct answer but fails to focus back on "Nevada", the correct answer for the given question. RoBERTa v2 then predicts the tested question as unanswerable.
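The RoBERTa v2 behavior described for Figure 1 can be illustrated with a small sketch. It is not the authors' evaluation code: the function names, the threshold parameter, and the scores are all hypothetical, chosen only to show how a SQuAD 2.0-style model may abstain on an answerable question while its best non-null span is still the correct one.

```python
from typing import Dict, Optional


def predict_with_abstention(
    span_scores: Dict[str, float],
    null_score: float,
    threshold: float = 0.0,
) -> Optional[str]:
    """SQuAD 2.0-style decision (illustrative): return None, meaning
    "unanswerable", when the null score exceeds the best span score
    by more than `threshold`; otherwise return the best span."""
    best_span = max(span_scores, key=span_scores.get)
    if null_score - span_scores[best_span] > threshold:
        return None
    return best_span


def predict_forced(span_scores: Dict[str, float]) -> str:
    """Ignore the null option and return the highest-scoring span.
    Only meaningful when the question is known to be answerable."""
    return max(span_scores, key=span_scores.get)


# Invented scores for illustration only; they are not taken from the paper.
example_scores = {"Nevada": 2.1, "DartFord": 1.4}
example_null_score = 3.0

abstaining = predict_with_abstention(example_scores, example_null_score)
forced = predict_forced(example_scores)
print(abstaining, forced)
```

With these made-up numbers the abstaining decision returns no answer, while the forced decision returns "Nevada". This is one possible reading of how a prediction of "unanswerable" can hide a correct span ranking; whether it matches the paper's procedure is an assumption.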
