Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning

Minghui Chen, Meirui Jiang, Xin Zhang, Qi Dou, Zehua Wang, Xiaoxiao Li

Oct-31-2024, arXiv.org Artificial Intelligence

Federated learning (FL) is a learning paradigm that enables collaborative training of models using decentralized data. Recently, initializing FL from pre-trained weights has been shown to effectively improve model performance. However, the growing complexity of current pre-trained models, characterized by a substantial increase in parameters, markedly intensifies the challenges associated with the communication rounds required for their adaptation to FL. To address these communication cost issues and improve the performance of pre-trained model adaptation in FL, we propose a model interpolation-based local training technique called "Local Superior Soups." Our method enhances local training across different clients, encouraging the exploration of a connected low-loss basin within a few communication rounds through regularized model interpolation. This approach acts as a catalyst for the seamless adaptation of pre-trained models in FL. We demonstrate its effectiveness and efficiency across diverse, widely used FL datasets. Our code is available at https://github.com/ubc-tea/Local-Superior-Soups.
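
As a rough illustration of what "model interpolation" means in weight space, the sketch below averages several sets of model weights with given coefficients. It is not the authors' algorithm: the abstract does not specify the regularization terms or the local training loop, so none of that is shown. The `Weights` type and the function names `interpolate` and `uniform_soup` are hypothetical and chosen only for this example.

```python
from typing import Dict, List

# Hypothetical flat representation (an assumption for this sketch):
# parameter name -> list of scalar values.
Weights = Dict[str, List[float]]


def interpolate(models: List[Weights], coeffs: List[float]) -> Weights:
    """Return the coefficient-weighted average of several weight sets."""
    if not models:
        raise ValueError("need at least one model")
    if len(models) != len(coeffs):
        raise ValueError("need one coefficient per model")
    total = sum(coeffs)
    if total <= 0:
        raise ValueError("coefficients must sum to a positive value")

    merged: Weights = {}
    for name, reference in models[0].items():
        # Average each scalar position across all models, normalizing by
        # the coefficient sum so the result stays a convex combination.
        merged[name] = [
            sum(c * m[name][i] for m, c in zip(models, coeffs)) / total
            for i in range(len(reference))
        ]
    return merged


def uniform_soup(models: List[Weights]) -> Weights:
    """Equal-weight average of all models (a plain uniform 'soup')."""
    return interpolate(models, [1.0] * len(models))
```

For two models `a` and `b`, `uniform_soup([a, b])` returns their midpoint in weight space, and `interpolate([a, b], [1.0, 3.0])` returns a point three quarters of the way from `a` to `b`. The same weighted average applied to client models on a server corresponds to a FedAvg-style aggregation step.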

large language model, machine learning, natural language

Country:
- North America
  - Canada (0.14)
  - United States (0.14)

Genre:
- Research Report
  - Experimental Study (1.00)
  - Promising Solution (0.66)

Industry:
- Education (0.55)
- Information Technology (0.67)
- Materials > Chemicals
  - Specialty Chemicals (0.60)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language > Large Language Model (0.93)