Mutual Enhancement of Large and Small Language Models with Cross-Silo Knowledge Transfer
Yongheng Deng, Ziqing Qiao, Ju Ren, Yang Liu, Yaoxue Zhang
While large language models (LLMs) are empowered with broad knowledge, their task-specific performance is often suboptimal. This necessitates fine-tuning LLMs with task-specific data, but such data may be inaccessible due to privacy concerns. In this paper, we propose a novel approach to enhance LLMs with smaller language models (SLMs) that are trained on clients using their private task-specific data. To enable mutual enhancement between LLMs and SLMs, we propose CrossLM, in which the SLMs guide the LLM to generate task-specific, high-quality data, and both the LLM and the SLMs are then enhanced with the generated data. We evaluate CrossLM using publicly accessible language models across a range of benchmark tasks. The results demonstrate that CrossLM simultaneously enhances the task-specific performance of the SLMs on clients and of the LLM on the cloud server, while preserving the LLM's generalization capability.
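The abstract describes the CrossLM workflow only at a high level: client-side SLMs judge synthetic data produced by the cloud LLM, and both sides are then fine-tuned on the accepted samples. The sketch below is a minimal, illustrative rendering of such a mutual-enhancement round; it is not the paper's actual algorithm or API, and every class, function, and threshold here (ToyModel, crosslm_round, score, threshold=0.5) is a hypothetical placeholder.

```python
# Illustrative sketch of a CrossLM-style mutual-enhancement loop.
# All names below are hypothetical placeholders, not the paper's implementation.
import random
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Stand-in for either the cloud LLM or a client-side SLM."""
    name: str
    corpus: list = field(default_factory=list)

    def generate(self, task_prompt: str, n: int = 4) -> list:
        # A real LLM would sample task-specific candidate texts here.
        return [f"{task_prompt} :: candidate {i} from {self.name}" for i in range(n)]

    def score(self, text: str) -> float:
        # A real SLM would rate how well `text` matches its private task data.
        return random.random()

    def fine_tune(self, data: list) -> None:
        # A real model would update its weights; here we only collect the data.
        self.corpus.extend(data)


def crosslm_round(llm: ToyModel, slms: list, task_prompts: list, threshold: float = 0.5):
    """One round of LLM<->SLM knowledge transfer: the LLM synthesizes task data,
    the client SLMs filter it by quality, and both sides train on accepted samples."""
    accepted = []
    for prompt in task_prompts:
        for candidate in llm.generate(prompt):
            # Clients judge candidates without exposing their raw private data.
            avg_score = sum(slm.score(candidate) for slm in slms) / len(slms)
            if avg_score >= threshold:
                accepted.append(candidate)
    llm.fine_tune(accepted)           # cloud-side enhancement
    for slm in slms:
        slm.fine_tune(accepted)       # client-side enhancement
    return accepted


if __name__ == "__main__":
    llm = ToyModel("cloud-LLM")
    clients = [ToyModel(f"client-SLM-{i}") for i in range(3)]
    kept = crosslm_round(llm, clients, ["sentiment classification"])
    print(f"accepted {len(kept)} synthetic samples")
```

In this reading, only the generated text and scalar quality scores cross the client/cloud boundary, which is consistent with the abstract's claim that private task data never leaves the clients.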
arXiv.org Artificial Intelligence
Dec-10-2023
- Genre:
  - Overview > Innovation (0.34)
  - Research Report
    - New Finding (0.48)
    - Promising Solution (0.34)
- Industry:
  - Health & Medicine (1.00)
  - Information Technology > Security & Privacy (1.00)
- Technology: