Reliability Testing for Natural Language Processing Systems
Tan, Samson; Joty, Shafiq; Baxter, Kathy; Taeihagh, Araz; Bennett, Gregory A.; Kan, Min-Yen – arXiv.org Artificial Intelligence
Questions of fairness, robustness, and transparency are paramount to address before deploying NLP systems. Central to these concerns is the question of reliability: Can NLP systems reliably treat different demographics fairly and function correctly in diverse and noisy environments? To address this, we argue for the need for reliability testing and contextualize it among existing work on improving accountability. We show how adversarial attacks can be reframed for this goal, via a framework for developing reliability tests. We argue that reliability testing -- with an emphasis on interdisciplinary collaboration -- will enable rigorous and targeted testing, and aid in the enactment and enforcement of industry standards.
Figure 1: How DOCTOR can integrate with existing system development workflows. Test (left) and system development (right) take place in parallel, separate teams. Reliability tests can thus be constructed independent of the system development team, either by an internal "red team" or by independent auditors.
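The abstract's idea of reframing adversarial attacks as reliability tests can be illustrated with a minimal sketch: treat an input perturbation as a test condition and report worst-case model behaviour under it. This is only an illustrative example, not the paper's DOCTOR framework; the names `toy_sentiment_model`, `swap_adjacent_chars`, and `worst_case_consistency`, and the typo-style perturbation, are hypothetical stand-ins.

```python
# Illustrative sketch only: a worst-case "reliability test" in the spirit of
# reframing adversarial perturbations as tests. All names are hypothetical.
import random
from typing import Callable, List

def swap_adjacent_chars(text: str, rng: random.Random) -> str:
    """Apply a single adjacent-character swap (a simple typo-style perturbation)."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def worst_case_consistency(
    model: Callable[[str], str],
    inputs: List[str],
    n_perturbations: int = 20,
    seed: int = 0,
) -> float:
    """Fraction of inputs whose prediction survives ALL sampled perturbations,
    i.e. an adversarial, worst-case notion of reliability."""
    rng = random.Random(seed)
    robust = 0
    for text in inputs:
        original = model(text)
        if all(model(swap_adjacent_chars(text, rng)) == original
               for _ in range(n_perturbations)):
            robust += 1
    return robust / len(inputs)

def toy_sentiment_model(text: str) -> str:
    """Hypothetical placeholder model: keyword-based sentiment 'classifier'."""
    return "positive" if "good" in text.lower() else "negative"

if __name__ == "__main__":
    test_inputs = ["The service was good", "Terrible experience overall"]
    score = worst_case_consistency(toy_sentiment_model, test_inputs)
    print(f"Worst-case consistency under typo noise: {score:.2f}")
```

Because the test depends only on the model's input-output behaviour, it can be written and run by a team separate from the one building the system, which is the workflow Figure 1 describes.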
May-13-2021
- Country:
- Asia (1.00)
- Europe (1.00)
- North America > United States
- Minnesota > Hennepin County > Minneapolis (0.14)
- Genre:
- Overview (0.93)
- Research Report (1.00)
- Industry:
- Education > Assessment & Standards
- Student Performance (0.68)
- Government > Regional Government
- Health & Medicine (0.93)
- Information Technology > Security & Privacy (1.00)
- Law (1.00)