Large Language Models as Attribution Regularizers for Efficient Model Training

Vukadin, Davor, Šilić, Marin, Delač, Goran

Feb-27-2025–arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable performance across diverse domains. However, effectively leveraging their vast knowledge for training smaller downstream models remains an open challenge, especially in domains like tabular data learning, where simpler models are often preferred due to interpretability and efficiency. In this paper, we introduce a novel yet straightforward method for incorporating LLM-generated global task feature attributions into the training process of smaller networks. Specifically, we propose an attribution-matching regularization term that aligns the training dynamics of the smaller model with the insights provided by the LLM. By doing so, our approach yields superior performance in few-shot learning scenarios. Notably, our method requires only black-box API access to the LLM, making it easy to integrate into existing training pipelines with minimal computational overhead. Furthermore, we demonstrate how this method can be used to address common issues in real-world datasets, such as skewness and bias. By integrating high-level knowledge from LLMs, our approach improves generalization, even when training data is limited or imbalanced. We validate its effectiveness through extensive experiments across multiple tasks, demonstrating improved learning efficiency and model robustness.
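To make the idea of an attribution-matching regularization term concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the attribution method or the distance measure, so everything here is an assumption for illustration: the small model is a logistic regression, its global feature attributions are taken to be normalized absolute weights, the mismatch is a squared distance, and the names `llm_scores` and `lam` are hypothetical.

```python
import math

def normalize(scores):
    """Scale non-negative scores so they sum to 1 (uniform if all are zero)."""
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]

def model_attributions(weights):
    """Assumed global attribution of a linear model: normalized |weight|."""
    return normalize([abs(w) for w in weights])

def attribution_penalty(weights, llm_scores):
    """Assumed mismatch: squared distance between model and LLM attributions."""
    model_attr = model_attributions(weights)
    target = normalize(llm_scores)
    return sum((m - t) ** 2 for m, t in zip(model_attr, target))

def logistic_loss(weights, bias, X, y):
    """Mean binary cross-entropy of a logistic-regression model."""
    total = 0.0
    for row, label in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, row))
        p = 1.0 / (1.0 + math.exp(-z))
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return total / len(X)

def regularized_loss(weights, bias, X, y, llm_scores, lam=1.0):
    """Task loss plus lam times the attribution-matching penalty."""
    task = logistic_loss(weights, bias, X, y)
    return task + lam * attribution_penalty(weights, llm_scores)

# Hypothetical usage: three features, LLM-provided importance scores.
X = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.2], [1.0, 1.0, 0.1]]
y = [1, 0, 1]
llm_scores = [0.5, 0.25, 0.25]
loss = regularized_loss([2.0, -1.0, 1.0], 0.0, X, y, llm_scores, lam=0.5)
```

In this sketch the LLM scores only need to be obtained once per task, for example through a single black-box API call, and the penalty is added to the ordinary training loss; `lam` controls how strongly the small model is pulled toward the LLM's view of feature importance.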

Country:
- Oceania > Australia (0.28)
- Africa > Rwanda (0.14)
- North America > United States
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
  - California > San Francisco County
    - San Francisco (0.14)
- Europe
  - Austria > Vienna (0.14)
  - Croatia (0.14)
  - Spain (0.14)
  - Italy (0.14)
- Asia > Middle East
  - UAE (0.14)

Genre:
- Research Report > New Finding (0.93)

Industry:
- Health & Medicine > Therapeutic Area (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
