LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing

Wang, Zhengxiang; Makarova, Veronika; Li, Zhi; Kodner, Jordan; Rambow, Owen

Feb-16-2025, arXiv.org Artificial Intelligence

The paper explores the performance of LLMs in the context of multi-dimensional analytic writing assessments, i.e., their ability to provide both scores and comments based on multiple assessment criteria. Using a corpus of literature reviews written by L2 graduate students and assessed by human experts against 9 analytic criteria, we prompt several popular LLMs to perform the same task under various conditions. To evaluate the quality of feedback comments, we apply a novel feedback comment quality evaluation framework. Compared to existing methods that rely on manual judgments, this framework is interpretable, cost-efficient, scalable, and reproducible. We find that LLMs can generate reasonably good and generally reliable multi-dimensional analytic assessments. We release our corpus for reproducibility.

large language model, machine learning, natural language

Country:
- North America > United States (1.00)
- Asia (0.67)
- Europe > United Kingdom
  - England (0.28)

Genre:
- Overview (1.00)
- Instructional Material (0.93)
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.67)

Industry:
- Health & Medicine > Therapeutic Area
  - Psychiatry/Psychology (0.46)
- Education
  - Educational Technology > Educational Software (0.68)
  - Educational Setting > Online (0.68)
  - Assessment & Standards > Student Performance (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
