FreeLM: Fine-Tuning-Free Language Model
Xiang Li, Xin Jiang, Xuying Meng, Aixin Sun, Yequan Wang
arXiv.org Artificial Intelligence
Pre-trained language models (PLMs) have achieved remarkable success in NLP tasks. Despite this success, mainstream solutions largely follow the pre-training then fine-tuning paradigm, which brings both high deployment costs and low training efficiency. Fine-tuning on a specific task is nevertheless essential, because PLMs are pre-trained only with the language signal from large raw corpora. In this paper, we propose a novel fine-tuning-free strategy for language models that considers both a language signal and a teacher signal. The teacher signal is an abstraction of a battery of downstream tasks, provided in a unified proposition format. Trained with both language and strong task-aware teacher signals in an interactive manner, our FreeLM model demonstrates strong generalization and robustness. In experiments, FreeLM outperforms large models such as GPT-3 and InstructGPT on a range of language understanding tasks, while being much smaller, with 0.3B parameters compared to their 175B.
May 2, 2023