Exploring LLM Autoscoring Reliability in Large-Scale Writing Assessments Using Generalizability Theory