In-Context Learning Distillation for Efficient Few-Shot Fine-Tuning

Yifei Duan, Liu Li, Zirui Zhai, Jinxia Yao

arXiv.org Artificial Intelligence 

Conventional solutions to few-shot learning generally fall into two categories: weights-updating fine-tuning and prompt-based in-context learning. Each approach has significant limitations, particularly when scaling to larger models or deploying in resource-constrained environments. Fine-tuning requires updating some or all model parameters, leading to high computational costs and potential catastrophic forgetting. This work applied few-shot in-context learning to a 1.3B-parameter model for the natural language inference task and employed knowledge distillation to internalize the context information, reducing the model from 1.3B to 125M parameters and achieving a size reduction from 2.5GB to 0.25GB. Compared to using in-context learning alone on similarly sized models, this context distillation approach achieved a nearly 50% improvement in out-of-domain accuracy, demonstrating the effectiveness of distilling in-context information into model weights.
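
The core idea, distilling a context-conditioned teacher into a context-free student, can be sketched roughly as follows. In this minimal illustration, only the large teacher sees the few-shot examples, while the small student sees the bare query and is trained to match the teacher's next-token distribution. The OPT checkpoints (chosen only because they match the stated 1.3B and 125M sizes), the NLI prompt format, and all hyperparameters are assumptions for illustration, not the paper's reported setup.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed stand-ins for the 1.3B teacher and 125M student; they share a
# tokenizer and vocabulary, which the KL objective below requires.
teacher = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b").eval()
student = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

# Hypothetical few-shot NLI prompt; the real demonstrations would be
# drawn from the training split.
few_shot_prompt = (
    "Premise: A man is sleeping. Hypothesis: A man is awake. Label: contradiction\n"
    "Premise: Kids play outside. Hypothesis: Children are outdoors. Label: entailment\n"
)
query = "Premise: A dog runs. Hypothesis: An animal is moving. Label:"

# Teacher conditions on the in-context examples plus the query.
teacher_ids = tokenizer(few_shot_prompt + query, return_tensors="pt").input_ids
# Student sees the query alone; the context must be internalized.
student_ids = tokenizer(query, return_tensors="pt").input_ids

with torch.no_grad():
    teacher_logits = teacher(teacher_ids).logits[:, -1, :]

student_logits = student(student_ids).logits[:, -1, :]

# KL divergence pulls the context-free student's next-token distribution
# toward the context-conditioned teacher's.
loss = F.kl_div(
    F.log_softmax(student_logits, dim=-1),
    F.softmax(teacher_logits, dim=-1),
    reduction="batchmean",
)
loss.backward()
optimizer.step()
```

In a full training loop this step would repeat over many queries and demonstration sets; because the distilled student no longer needs the prompt at inference time, it avoids both the per-query cost of long few-shot contexts and the memory footprint of the larger model.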