GENTLE: A Genre-Diverse Multilayer Challenge Set for English NLP and Linguistic Evaluation

Tatsuya Aoyama, Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Janet Liu, Siyao Peng, Yilun Zhu, Amir Zeldes

Sep-21-2023, arXiv.org Artificial Intelligence

We present GENTLE, a new mixed-genre English challenge corpus totaling 17K tokens and consisting of 8 unusual text types for out-of-domain evaluation: dictionary entries, esports commentaries, legal documents, medical notes, poetry, mathematical proofs, syllabuses, and threat letters. GENTLE is manually annotated for a variety of popular NLP tasks, including syntactic dependency parsing, entity recognition, coreference resolution, and discourse parsing. We evaluate state-of-the-art NLP systems on GENTLE and find that, on every task, their performance degrades severely for at least some genres, which indicates GENTLE's utility as an evaluation dataset for NLP systems.

annotation, computational linguistic, proceedings, (13 more...)

Country:
- Asia > Singapore (0.04)
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- North America
  - Dominican Republic (0.04)
  - Canada (0.04)
  - United States
    - New Mexico (0.04)
    - Maryland > Baltimore (0.04)
    - California > San Diego County
      - San Diego (0.04)
- Europe
  - Czechia > Prague (0.04)
  - Germany > Berlin (0.04)
  - Iceland > Capital Region
    - Reykjavik (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)

Genre:
- Instructional Material (0.67)
- Research Report (0.64)

Industry:
- Law (1.00)
- Leisure & Entertainment > Sports (0.36)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (1.00)
