Initialization Matters: Unraveling the Impact of Pre-Training on Federated Learning

Jhunjhunwala, Divyansh, Sharma, Pranay, Xu, Zheng, Joshi, Gauri

Feb-11-2025–arXiv.org Artificial Intelligence

Initializing with pre-trained models when learning on downstream tasks is becoming standard practice in machine learning. Several recent works explore the benefits of pre-trained initialization in a federated learning (FL) setting, where the downstream training is performed at edge clients with heterogeneous data distributions. These works show that starting from a pre-trained model can substantially reduce the adverse impact of data heterogeneity on the test performance of a model trained in a federated setting, with no changes to the standard FedAvg training algorithm. In this work, we provide a deeper theoretical understanding of this phenomenon. To do so, we study the class of two-layer convolutional neural networks (CNNs) and provide bounds on the training error convergence and test error of such a network trained with FedAvg. We introduce the notion of aligned and misaligned filters at initialization and show that data heterogeneity affects learning only on the misaligned filters. Starting with a pre-trained model typically results in fewer misaligned filters at initialization, thus producing a lower test error even when the model is trained in a federated setting with data heterogeneity. Experiments in synthetic settings and practical FL training on CNNs verify our theoretical findings.
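For readers unfamiliar with FedAvg, the training loop referred to in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's setup: it assumes a linear least-squares model instead of a two-layer CNN, and the helper names (`local_sgd`, `fedavg_round`, `make_clients`, `global_loss`) and the way heterogeneity is simulated (each client has a slightly different target vector) are hypothetical choices made for this sketch.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, steps=5):
    """Run a few full-batch gradient steps on one client's least-squares loss."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w_global, clients, lr=0.1, steps=5):
    """One FedAvg round: every client starts from the global model, trains
    locally, and the server averages the results weighted by dataset size."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    local_models = np.stack(
        [local_sgd(w_global, X, y, lr, steps) for X, y in clients]
    )
    return (sizes[:, None] * local_models).sum(axis=0) / sizes.sum()

def make_clients(rng, w_true, n_clients=4, n=50, hetero=0.3):
    """Assumed toy heterogeneity: each client's labels come from its own
    perturbed copy of w_true, so local optima differ across clients."""
    clients = []
    for _ in range(n_clients):
        X = rng.normal(size=(n, len(w_true)))
        w_k = w_true + hetero * rng.normal(size=len(w_true))
        clients.append((X, X @ w_k))
    return clients

def global_loss(w, clients):
    """Average least-squares loss over all client samples."""
    total = sum(0.5 * np.sum((X @ w - y) ** 2) for X, y in clients)
    count = sum(len(y) for _, y in clients)
    return total / count

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clients = make_clients(rng, np.array([1.0, -2.0, 0.5]))
    w = np.zeros(3)  # stand-in for a random (not pre-trained) initialization
    for _ in range(20):
        w = fedavg_round(w, clients)
    print("global loss after 20 rounds:", global_loss(w, clients))
```

In this sketch the only thing a pre-trained model would change is the starting value passed as `w`; the FedAvg rounds themselves stay the same, which matches the abstract's remark that no changes to the algorithm are required.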



Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America
  - United States
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe
  - Poland (0.04)
  - Italy > Sardinia (0.04)
- Asia > China
  - Shanghai > Shanghai (0.04)

Genre:
- Research Report > New Finding (0.67)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
