Prompting Strategies for Language Model-Based Item Generation in K-12 Education: Bridging the Gap Between Small and Large Language Models
Amini, Mohammad; Ahmadi, Babak; Xiong, Xiaomeng; Zhang, Yilin; Qiao, Christopher
arXiv.org Artificial Intelligence
This study explores automatic item generation (AIG) using language models to create multiple-choice questions (MCQs) for morphological assessment, aiming to reduce the cost and inconsistency of manual test development. The study used a two-fold approach. First, we compared a fine-tuned mid-sized model (Gemma, 2B) with a larger untuned one (GPT-3.5, 175B). Second, we evaluated seven structured prompting strategies, including zero-shot, few-shot, chain-of-thought, role-based, sequential, and combinations of these. Generated items were assessed using automated metrics and expert scoring across five dimensions. We also used GPT-4.1, trained on expert-rated samples, to simulate human scoring at scale. Results show that structured prompting, especially strategies combining chain-of-thought and sequential design, significantly improved Gemma's outputs. Gemma generally produced more construct-aligned and instructionally appropriate items than GPT-3.5's zero-shot responses, with prompt design playing a key role in mid-sized model performance. This study demonstrates that structured prompting and efficient fine-tuning can enhance mid-sized models for AIG under limited data conditions. We highlight the value of combining automated metrics, expert judgment, and large-model simulation to ensure alignment with assessment goals. The proposed workflow offers a practical and scalable way to develop and validate language assessment items for K-12.
Aug-29-2025
- Country:
  - Asia > Middle East
    - UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
  - Europe > United Kingdom
    - England > Greater London > London (0.04)
  - North America > United States
    - Florida > Alachua County
      - Gainesville (0.14)
    - New York > New York County
      - New York City (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Education > Educational Setting > K-12 Education (1.00)
- Technology: