UnibucLLM: Harnessing LLMs for Automated Prediction of Item Difficulty and Response Time for Multiple-Choice Questions

Rogoz, Ana-Cristina, Ionescu, Radu Tudor

Apr-20-2024–arXiv.org Artificial Intelligence

This work explores a novel data augmentation method based on Large Language Models (LLMs) for predicting item difficulty and response time of retired USMLE Multiple-Choice Questions (MCQs) in the BEA 2024 Shared Task. Our approach is based on augmenting the dataset with answers from zero-shot LLMs (Falcon, Meditron, Mistral) and employing transformer-based models based on six alternative feature combinations. The results suggest that predicting the difficulty of questions is more challenging. Notably, our top performing methods consistently include the question text, and benefit from the variability of LLM answers, highlighting the potential of LLMs for improving automated assessment in medical licensing exams. We make our code available https://github.com/ana-rogoz/BEA-2024.

dataset, llm, response time, (14 more...)

arXiv.org Artificial Intelligence

Apr-20-2024

arXiv.org PDF

Add feedback

Country:
- North America
  - United States
    - Washington > King County
      - Seattle (0.04)
    - New York > New York County
      - New York City (0.04)
  - Mexico > Mexico City
    - Mexico City (0.04)
- Europe
  - Romania > București - Ilfov Development Region
    - Municipality of Bucharest > Bucharest (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Education (0.62)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.92)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found