Large Language Model-Driven Dynamic Assessment of Grammatical Accuracy in English Language Learner Writing
Jaganov, Timur, Blake, John, Villegas, Julián, Carr, Nicholas
arXiv.org Artificial Intelligence
This study investigates the potential for Large Language Models (LLMs) to scale up Dynamic Assessment (DA). To facilitate this investigation, we first developed DynaWrite, a modular, microservices-based grammatical tutoring application that supports multiple LLMs in generating dynamic feedback for learners of English. Initial testing of 21 LLMs revealed GPT-4o and neural chat to have the most potential to scale up DA in the language learning classroom. Further testing of these two candidates found that both models performed similarly in their ability to accurately identify grammatical errors in user sentences. However, GPT-4o consistently outperformed neural chat in the quality of its DA, generating clear, consistent, and progressively explicit hints. Real-time responsiveness and system stability were also confirmed through detailed performance testing, with GPT-4o exhibiting sufficient speed and stability. This study shows that LLMs can be used to scale up dynamic assessment and thus enable it to be delivered to larger groups than is possible in traditional teacher-learner settings.
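The "progressively explicit hints" central to the DA approach described above can be sketched as a simple hint ladder: each failed attempt moves the learner one step toward a fully explicit correction. The ladder content, function name, and hint wording below are invented for illustration and are not taken from DynaWrite or the paper itself.

```python
# Minimal sketch of graduated-hint Dynamic Assessment.
# The hints progress from implicit (locate the error yourself)
# to fully explicit (the correction is given). All wording is
# hypothetical example content for a subject-verb agreement error.

HINT_LADDER = [
    "There is a grammatical error in your sentence. Can you find it?",
    "Look closely at the verb phrase.",
    "Check the agreement between the subject and the verb.",
    "The verb should be 'has', not 'have'.",  # most explicit step
]

def next_hint(failed_attempts: int) -> str:
    """Return the hint for the given number of failed attempts,
    capping at the most explicit hint on the ladder."""
    index = min(failed_attempts, len(HINT_LADDER) - 1)
    return HINT_LADDER[index]
```

In a full system, an LLM such as GPT-4o would presumably generate each rung of the ladder on the fly for the specific error it detected; the fixed list here only illustrates the escalation logic.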
Sep-8-2025