FEANEL: A Benchmark for Fine-Grained Error Analysis in K-12 English Writing
Ye, Jingheng; Wang, Shen; Chen, Jiaqi; Wang, Hebin; Zou, Deqing; Zhu, Yanyu; Tang, Jiwei; Zheng, Hai-Tao; Liu, Ruitong; Li, Haoyang; Wang, Yanfeng; Wen, Qingsong
–arXiv.org Artificial Intelligence
Large Language Models (LLMs) have transformed artificial intelligence, offering profound opportunities for educational applications. However, their ability to provide fine-grained educational feedback for K-12 English writing remains underexplored. In this paper, we challenge the error analysis and pedagogical skills of LLMs by introducing the problem of Fine-grained Error Analysis for English Learners and presenting the Fine-grained Error ANalysis for English Learners (FEANEL) Benchmark. The benchmark comprises 1,000 essays written by elementary and secondary school students, together with a well-developed taxonomy of English writing errors. Each error is annotated by language education experts with its type, its severity, and explanatory feedback, using a part-of-speech-based taxonomy that the experts co-developed. We evaluate state-of-the-art LLMs on the FEANEL Benchmark to explore their error analysis and pedagogical abilities. Experimental results reveal significant gaps in current LLMs' ability to perform fine-grained error analysis, highlighting the need for advances in methods designed specifically for educational applications.
Dec-1-2025
- Country:
  - Asia
    - China
    - Middle East > Republic of Türkiye
      - Mersin Province > Mersin (0.04)
    - Singapore (0.04)
    - Thailand > Bangkok
      - Bangkok (0.04)
  - Europe > Austria
    - Vienna (0.14)
  - North America
    - Mexico > Mexico City
      - Mexico City (0.04)
    - United States
      - Florida > Miami-Dade County
        - Miami (0.04)
      - Michigan > Washtenaw County
        - Ann Arbor (0.04)
      - New Mexico > Bernalillo County
        - Albuquerque (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
      - Texas > Travis County
        - Austin (0.04)
  - South America > Uruguay
- Genre:
  - Research Report (1.00)
- Industry:
- Technology: