Large Language Models as Attribution Regularizers for Efficient Model Training

Vukadin, Davor; Šilić, Marin; Delač, Goran

arXiv.org Artificial Intelligence 

Large Language Models (LLMs) have demonstrated remarkable performance across diverse domains. However, effectively leveraging their vast knowledge for training smaller downstream models remains an open challenge, especially in domains like tabular data learning, where simpler models are often preferred due to interpretability and efficiency. In this paper, we introduce a novel yet straightforward method for incorporating LLM-generated global task feature attributions into the training process of smaller networks. Specifically, we propose an attribution-matching regularization term that aligns the training dynamics of the smaller model with the insights provided by the LLM. By doing so, our approach yields superior performance in few-shot learning scenarios. Notably, our method requires only black-box API access to the LLM, making it easy to integrate into existing training pipelines with minimal computational overhead. Furthermore, we demonstrate how this method can be used to address common issues in real-world datasets, such as skewness and bias. By integrating high-level knowledge from LLMs, our approach improves generalization, even when training data is limited or imbalanced. We validate its effectiveness through extensive experiments across multiple tasks, demonstrating improved learning efficiency and model robustness.
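The attribution-matching idea described above can be illustrated with a minimal sketch. The abstract does not specify the attribution method or the exact form of the regularizer, so everything here is an assumption made for illustration: the small model is a logistic regression, its global attributions are taken to be normalized absolute weights, the penalty is a mean squared gap, and the names `llm_scores` and `lam` are hypothetical.

```python
import math


def global_attributions(weights):
    """Normalized absolute weights, used here as a stand-in for global attributions."""
    total = sum(abs(w) for w in weights)
    if total == 0.0:
        return [0.0 for _ in weights]
    return [abs(w) / total for w in weights]


def attribution_penalty(weights, llm_scores):
    """Mean squared gap between model attributions and LLM-suggested scores.

    llm_scores is assumed to be a per-feature importance vector that sums
    to 1, obtained beforehand from an LLM (hypothetical input).
    """
    attrs = global_attributions(weights)
    return sum((a - s) ** 2 for a, s in zip(attrs, llm_scores)) / len(weights)


def task_loss(weights, bias, X, y):
    """Mean binary cross-entropy of a logistic model on rows X with labels y."""
    eps = 1e-12
    loss = 0.0
    for row, target in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, row))
        p = 1.0 / (1.0 + math.exp(-z))
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        loss -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return loss / len(X)


def total_loss(weights, bias, X, y, llm_scores, lam):
    """Task loss plus the attribution-matching term, weighted by lam (assumed)."""
    return task_loss(weights, bias, X, y) + lam * attribution_penalty(weights, llm_scores)
```

In this sketch the penalty vanishes when the model's normalized weights already agree with the LLM's scores, for example `attribution_penalty([2.0, 1.0, 1.0], [0.5, 0.25, 0.25])` evaluates to `0.0`. Any optimizer could then minimize `total_loss` in place of the plain task loss; the method in the paper may differ in both the attribution technique and the distance used.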
