What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing
Chenyang Yang, Yining Hong, Grace A. Lewis, Tongshuang Wu, Christian Kästner
arXiv.org Artificial Intelligence
Machine learning models make mistakes, yet it is sometimes difficult to identify the systematic problems behind those mistakes. Practitioners engage in various activities, including error analysis, testing, auditing, and red-teaming, to form hypotheses about what can go (or has gone) wrong with their models. To validate these hypotheses, practitioners employ data slicing to identify relevant examples. However, traditional data slicing is limited by the available features and by programmatic slicing functions. In this work, we propose SemSlicer, a framework that supports semantic data slicing: identifying semantically coherent slices without the need for existing features. SemSlicer uses Large Language Models to annotate datasets and generate slices from any user-defined slicing criterion. We show that SemSlicer generates accurate slices at low cost, allows flexible trade-offs between different design dimensions, reliably identifies under-performing data slices, and helps practitioners find useful data slices that reflect systematic problems.
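The workflow the abstract describes can be sketched as follows. This is a minimal, hypothetical illustration, not SemSlicer's actual implementation: a real system would prompt an LLM with the user's slicing criterion to decide slice membership, whereas here a simple keyword matcher stands in for that LLM call, and the tiny dataset and slice name are invented for the example.

```python
# Hypothetical sketch of semantic data slicing. In a real pipeline, an LLM
# would answer a yes/no membership prompt built from the user's slicing
# criterion; `annotate` below is a keyword-based stand-in for that call.

def annotate(example, criterion_keywords):
    """Stand-in for LLM annotation: does this example belong to the slice?"""
    text = example["text"].lower()
    return any(kw in text for kw in criterion_keywords)

def semantic_slice(dataset, criterion_keywords):
    """Collect the examples the annotator places in the slice."""
    return [ex for ex in dataset if annotate(ex, criterion_keywords)]

def slice_accuracy(examples):
    """Model accuracy restricted to a slice (None for an empty slice)."""
    if not examples:
        return None
    correct = sum(ex["prediction"] == ex["label"] for ex in examples)
    return correct / len(examples)

# Toy dataset: model predictions alongside gold labels (invented data).
dataset = [
    {"text": "Refund my order please", "label": "refund", "prediction": "refund"},
    {"text": "I want my money back",   "label": "refund", "prediction": "other"},
    {"text": "Where is my package?",   "label": "shipping", "prediction": "shipping"},
    {"text": "Reset my password",      "label": "account", "prediction": "account"},
]

# User-defined slicing criterion: "requests about refunds".
refund_slice = semantic_slice(dataset, ["refund", "money back"])

overall_acc = slice_accuracy(dataset)       # 0.75
slice_acc = slice_accuracy(refund_slice)    # 0.5 — an under-performing slice
```

Comparing per-slice accuracy against overall accuracy is how an under-performing slice surfaces; here the "refund" slice scores below the dataset average, flagging a candidate systematic problem.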
Sep-13-2024