Cross-model Control: Improving Multiple Large Language Models in One-time Training

May-27-2025, 09:57:14 GMT–Neural Information Processing Systems

The number of large language models (LLMs) with varying parameter scales and vocabularies is increasing. While they deliver powerful performance, they also share a set of common optimization needs to meet specific requirements or standards, such as following instructions or avoiding the output of sensitive real-world information. However, reusing the fine-tuning outcome of one model for other models to reduce training costs remains a challenge. To bridge this gap, we introduce Cross-model Control (CMC), a method that improves multiple LLMs in one-time training with a portable tiny language model. Specifically, we observe that the logit shift before and after fine-tuning is remarkably similar across different models.
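
The logit-shift observation can be made concrete with a toy sketch. This is not the authors' implementation; it only illustrates what "logit shift" means and how a shift measured on one model could, in principle, be added to another model's logits. It assumes both models share the same vocabulary (the abstract notes that real models often do not), and the helper names `logit_shift` and `apply_shift`, the scaling factor `alpha`, and all numbers are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_shift(base_logits, tuned_logits):
    """Per-token difference between logits after and before fine-tuning."""
    return [t - b for b, t in zip(base_logits, tuned_logits)]


def apply_shift(target_logits, shift, alpha=1.0):
    """Add a (scaled) logit shift to another model's logits.

    Assumption: both models use the same vocabulary, so index i refers
    to the same token in every list. `alpha` is a hypothetical knob.
    """
    return [z + alpha * d for z, d in zip(target_logits, shift)]


# Toy next-token logits over a 4-token vocabulary (made-up values).
tiny_base = [1.0, 0.5, 0.2, -0.3]   # tiny model before fine-tuning
tiny_tuned = [0.2, 0.5, 1.4, -0.3]  # tiny model after fine-tuning
large_base = [2.0, 1.0, 1.5, 0.0]   # a different, larger model

shift = logit_shift(tiny_base, tiny_tuned)
large_shifted = apply_shift(large_base, shift)

# The larger model's preferred token moves from index 0 to index 2.
print(large_base.index(max(large_base)))
print(large_shifted.index(max(large_shifted)))
print(softmax(large_shifted))
```

In this toy case the shift lowers the logit of token 0 and raises that of token 2, so the larger model's most likely token changes without any update to its own weights.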

cross-model control, language model, tiny language model, (3 more...)

Country:
- Asia > Myanmar > Tanintharyi Region > Dawei (0.08)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)