SWAN: A Generic Framework for Auditing Textual Conversational Systems

May-14-2023–arXiv.org Artificial Intelligence

We argue that such frameworks should satisfy at least the following requirements.

Alertness: They should detect potential problems with extremely high recall (i.e., near-zero misses), while appropriately crediting the benefits of the conversational systems. Moreover, when aiming for high recall, the different people involved (not just users, but also, for example, workers who label data for training the system) should be taken into account; in particular, an evaluation framework that ignores negative impacts on marginalised people does not satisfy the alertness requirement.

Specificity: The evaluation framework should be specific when locating the problem(s) within conversations. For example, an evaluation result that says "There is a problem somewhere inside this conversation session" is less useful than one that says "There is a problem in this particular system turn," which in turn is less useful than one that says "There is a problem in this particular claim within this system turn."

artificial intelligence, chatbot, natural language

Genre:
- Research Report (0.41)

Technology:
- Information Technology > Artificial Intelligence > Natural Language
  - Chatbot (0.88)
  - Discourse & Dialogue (0.68)
