CLEME2.0: Towards More Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction

Ye, Jingheng, Xu, Zishan, Li, Yinghui, Cheng, Xuxin, Song, Linlin, Zhou, Qingyu, Zheng, Hai-Tao, Shen, Ying, Su, Xin

Jun-30-2024–arXiv.org Artificial Intelligence

The paper focuses on improving the interpretability of Grammatical Error Correction (GEC) metrics, which has received little attention in previous studies. To bridge the gap, we propose CLEME2.0, a reference-based evaluation strategy that describes four elementary dimensions of GEC systems: hit-correction, error-correction, under-correction, and over-correction. Together, these dimensions reveal the critical characteristics of GEC systems and help locate their drawbacks. Evaluating systems by combining these dimensions yields higher consistency with human judgements than other reference-based and reference-less metrics. Extensive experiments on 2 human judgement datasets and 6 reference datasets demonstrate the effectiveness and robustness of our method. All code will be released after peer review.
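To make the four dimensions concrete, the following is a minimal sketch and not the authors' implementation. It assumes that source, hypothesis, and reference have already been aligned into parallel chunks, that there is a single reference, and that a chunk's category can be read off from simple string comparisons; the abstract does not give the exact definitions, so these rules are an illustrative interpretation. The names `classify_chunk` and `count_dimensions` are hypothetical.

```python
from collections import Counter

def classify_chunk(src, hyp, ref):
    """Assign one aligned chunk to a dimension (assumed definitions, not the paper's)."""
    needs_edit = src != ref   # the reference changes this chunk
    edited = src != hyp       # the system changed this chunk
    if needs_edit and hyp == ref:
        return "hit-correction"      # system edit matches the reference
    if needs_edit and edited:
        return "error-correction"    # system edited, but not as the reference does
    if needs_edit:
        return "under-correction"    # system left a needed edit unmade
    if edited:
        return "over-correction"     # system edited a chunk that was already fine
    return None                      # unchanged and correct: no dimension applies

def count_dimensions(chunks):
    """Count how many chunks fall into each dimension."""
    counts = Counter()
    for src, hyp, ref in chunks:
        label = classify_chunk(src, hyp, ref)
        if label is not None:
            counts[label] += 1
    return counts

# Toy aligned chunks: (source, hypothesis, reference). Invented for illustration.
chunks = [
    ("go", "went", "went"),          # matches the reference
    ("in", "at", "on"),              # edited, but wrongly
    ("a", "a", "the"),               # needed edit missed
    ("quickly", "fast", "quickly"),  # unnecessary edit
    ("home", "home", "home"),        # untouched and correct
]
counts = count_dimensions(chunks)
```

Reporting the four counts separately, rather than folding them into one score, is what lets an evaluation of this kind show whether a system tends to miss errors or to change correct text.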


