Fine-tuning Large Language Models with Human-inspired Learning Strategies in Medical Question Answering
Yang, Yushi; Bean, Andrew M.; McCraith, Robert; Mahdi, Adam
arXiv.org Artificial Intelligence
Despite evidence that fine-tuning with curriculum learning improves the performance of LLMs for natural language understanding tasks, its effectiveness is typically assessed using a single model. In this work, we extend previous research by evaluating both curriculum-based and non-curriculum-based learning strategies across multiple LLMs, using human-defined and automated data labels for medical question answering. Our results indicate a moderate impact of using human-inspired learning strategies for fine-tuning LLMs, with maximum accuracy gains of 1.77% per model and 1.81% per dataset. Crucially, we demonstrate that the effectiveness of these strategies varies significantly across different model-dataset combinations, emphasising that the benefits of a specific human-inspired strategy for fine-tuning LLMs do not generalise. Additionally, we find evidence that curriculum learning using LLM-defined question difficulty outperforms human-defined difficulty, highlighting the potential of using model-generated measures for optimal curriculum design.
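As an illustration of the general idea, a curriculum-based strategy can be thought of as ordering the fine-tuning questions from easy to hard, while a non-curriculum baseline shuffles them. The sketch below is a minimal, assumed example and not the authors' implementation: the `Question` class, the `difficulty` score, and both ordering functions are hypothetical names introduced here, and the abstract does not specify how difficulty is measured or which orderings were tested.

```python
import random
from dataclasses import dataclass


@dataclass
class Question:
    text: str
    # Hypothetical difficulty score in [0, 1]; higher means harder.
    # It could come from human labels or from an LLM-derived measure.
    difficulty: float


def curriculum_order(questions):
    """Easy-to-hard ordering: sort by ascending difficulty."""
    return sorted(questions, key=lambda q: q.difficulty)


def random_order(questions, seed=0):
    """Non-curriculum baseline: a seeded shuffle of the same questions."""
    rng = random.Random(seed)
    shuffled = list(questions)
    rng.shuffle(shuffled)
    return shuffled


# Toy data, invented purely for illustration.
questions = [
    Question("Which vitamin deficiency causes scurvy?", 0.2),
    Question("Select the first-line therapy for condition X.", 0.7),
    Question("Interpret this arterial blood gas result.", 0.5),
]

ordered = curriculum_order(questions)
baseline = random_order(questions)
```

Both orderings contain the same examples; only the sequence in which they would be presented during fine-tuning differs, which is what a comparison of learning strategies would vary.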
Aug-14-2024
- Country:
  - North America > United States
    - New York (0.04)
    - Washington > King County
      - Seattle (0.04)
  - Europe > United Kingdom
    - England > Oxfordshire > Oxford (0.04)
  - Asia > Middle East
    - Jordan (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Health & Medicine (1.00)
  - Education > Curriculum (0.34)
- Technology: