Fine-tuning Large Language Models with Human-inspired Learning Strategies in Medical Question Answering

Yang, Yushi, Bean, Andrew M., McCraith, Robert, Mahdi, Adam

Aug-14-2024 – arXiv.org Artificial Intelligence

Despite evidence that fine-tuning with curriculum learning improves the performance of LLMs for natural language understanding tasks, its effectiveness is typically assessed using a single model. In this work, we extend previous research by evaluating both curriculum-based and non-curriculum-based learning strategies across multiple LLMs, using human-defined and automated data labels for medical question answering. Our results indicate a moderate impact of using human-inspired learning strategies for fine-tuning LLMs, with maximum accuracy gains of 1.77% per model and 1.81% per dataset. Crucially, we demonstrate that the effectiveness of these strategies varies significantly across different model-dataset combinations, emphasising that the benefits of a specific human-inspired strategy for fine-tuning LLMs do not generalise. Additionally, we find evidence that curriculum learning using LLM-defined question difficulty outperforms human-defined difficulty, highlighting the potential of using model-generated measures for optimal curriculum design.
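The contrast between curriculum-based and non-curriculum-based ordering can be sketched as follows. This is only a minimal illustration, assuming each question carries a single numeric difficulty score; the abstract does not say how human-defined or LLM-defined difficulty is computed, so the `Question` class, the `difficulty` field, and both function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Question:
    text: str
    difficulty: float  # hypothetical score: higher means harder


def curriculum_order(questions):
    """Curriculum-style ordering: present questions from easy to hard."""
    return sorted(questions, key=lambda q: q.difficulty)


def random_order(questions, seed=0):
    """Non-curriculum baseline: a seeded shuffle of the same questions."""
    rng = random.Random(seed)
    shuffled = list(questions)  # copy so the input list is left untouched
    rng.shuffle(shuffled)
    return shuffled


# Placeholder items, not data from the paper.
questions = [
    Question("q_hard", 0.9),
    Question("q_easy", 0.1),
    Question("q_medium", 0.5),
]

ordered = curriculum_order(questions)
baseline = random_order(questions)
```

Under this assumption, swapping human-defined for LLM-defined difficulty would change only the values in `difficulty`, not the ordering logic.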

Country:
- North America > United States
  - New York (0.04)
  - Washington > King County
    - Seattle (0.04)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Health & Medicine (1.00)
- Education > Curriculum (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (0.70)
    - Statistical Learning > Clustering (0.46)