A Novel Framework for Augmenting Rating Scale Tests with LLM-Scored Text Data

Watson, Joe, O'Connor, Ivan, Chen, Chia-Wen, Sun, Luning, Luo, Fang, Stillwell, David

arXiv.org Artificial Intelligence 

Psychological assessments are dominated by rating scales, which cannot capture the nuance in natural language. Efforts to supplement them with qualitative text have relied on labelled datasets or expert rubrics, limiting scalability. We introduce a framework that avoids this reliance: large language models (LLMs) score free-text responses with simple prompts to produce candidate LLM items, from which we retain those that yield the most test information when co-calibrated with a baseline scale. Using depression as a case study, we developed and tested the method in upper-secondary students (n=693) and a matched synthetic dataset (n=3,000). Results on held-out test sets showed that augmenting a 19-item scale with LLM items improved its precision, accuracy, and convergent validity. Further, the test information gain matched that of adding as many as 16 rating-scale items. This framework leverages the increasing availability of transcribed language to enhance psychometric measures, with applications in clinical health and beyond.
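
The selection step described above (retaining the candidate items that contribute the most test information) can be illustrated with a minimal sketch. The abstract does not state which item response model was used, so this sketch assumes a dichotomous two-parameter logistic (2PL) model, where an item's Fisher information at ability θ is a²·P(θ)·(1 − P(θ)). The item names, parameter values, ability grid, and helper functions are hypothetical and are not taken from the paper. The co-calibration step that would estimate the parameters is not shown.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta.

    a is the discrimination and b the difficulty (assumed already estimated).
    """
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def total_information(item, grid):
    """Sum an item's information over a grid of ability values."""
    a, b = item
    return sum(item_information(t, a, b) for t in grid)


def select_items(candidates, k, grid):
    """Return the names of the k candidates with the largest summed information."""
    ranked = sorted(
        candidates,
        key=lambda name: total_information(candidates[name], grid),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical ability grid from -3 to 3 in steps of 0.1.
grid = [x / 10.0 for x in range(-30, 31)]

# Hypothetical (discrimination, difficulty) pairs for three candidate LLM items.
candidates = {
    "llm_item_a": (1.8, 0.0),
    "llm_item_b": (0.4, 0.5),
    "llm_item_c": (1.2, -1.0),
}

selected = select_items(candidates, 2, grid)
print(selected)
```

Under these assumed parameters, the weakly discriminating item is dropped, since under a 2PL model an item's information over a wide ability range grows with its discrimination.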
