Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning with Heterogeneous LoRA Allocation
Zhang, Zikai, Liu, Ping, Xu, Jiahao, Hu, Rui
–arXiv.org Artificial Intelligence
Federated Learning (FL) has recently been utilized to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have recently gained attention, as they allow clients to fine-tune FMs locally with a small portion of trainable parameters. However, most existing methods do not account for the heterogeneous resources of clients or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. In this work, we propose Fed-HeLLo, a novel federated LoRA-based fine-tuning framework that enables clients to collaboratively fine-tune an FM with different local trainable LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate local trainable LoRA layers based on clients' resource capabilities and the layer importance. Specifically, based on the dynamic layer importance, we design a Fisher Information Matrix score-based HLA (FIM-HLA) that leverages dynamic gradient norm information. To better stabilize the training process, we consider the intrinsic importance of LoRA layers and design a Geometrically-Defined HLA (GD-HLA) strategy. It shapes the collective distribution of trainable LoRA layers into specific geometric patterns, such as Triangle, Inverted Triangle, Bottleneck, and Uniform. Moreover, we extend GD-HLA into a randomized version, named Randomized Geometrically-Defined HLA (RGD-HLA), which uses randomness to enhance model accuracy. By co-designing the proposed HLA strategies, we incorporate both the dynamic and intrinsic layer importance into the design of our HLA strategy. To thoroughly evaluate our approach, we simulate various complex federated LoRA-based fine-tuning settings using five datasets and three levels of data distribution ranging from IID to extreme Non-IID.
The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies.

Foundation models (FMs) [13], [16], [36], [37], [68], characterized by their extensive parameter counts ranging into millions or billions, serve as robust initial weights for a variety of downstream tasks [47], [52] via fine-tuning. However, employing FMs presents substantial challenges, especially the high computational costs of fine-tuning the model. To mitigate the high computational requirement of fine-tuning FMs, researchers have developed a variety of parameter-efficient fine-tuning (PEFT) methods.
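The abstract does not spell out how GD-HLA assigns layers, so the following is only an illustrative sketch of the general idea: each client gets a budget of trainable LoRA layers, and the per-layer count of clients training that layer takes a geometric shape. The function names (`pattern_weights`, `allocate`, `layer_counts`), the weight formulas, the reading of "Triangle" as favoring deeper layers, and the round-robin tie-breaking are all assumptions made for illustration, not the authors' method.

```python
def pattern_weights(num_layers, pattern):
    """Hypothetical per-layer preference scores for each geometric pattern."""
    mid = (num_layers - 1) / 2
    if pattern == "triangle":           # assumption: deeper layers preferred
        return [l + 1 for l in range(num_layers)]
    if pattern == "inverted_triangle":  # assumption: shallower layers preferred
        return [num_layers - l for l in range(num_layers)]
    if pattern == "bottleneck":         # assumption: both ends preferred, middle least
        return [abs(l - mid) + 1 for l in range(num_layers)]
    if pattern == "uniform":            # no layer preferred over another
        return [1] * num_layers
    raise ValueError(f"unknown pattern: {pattern}")


def allocate(num_layers, budget, pattern, offset=0):
    """Return a boolean mask with exactly `budget` trainable layers.

    Layers are ranked by pattern weight; ties are broken by a rotating
    offset so that different clients cover different tied layers.
    """
    if not 0 <= budget <= num_layers:
        raise ValueError("budget must lie in [0, num_layers]")
    w = pattern_weights(num_layers, pattern)
    order = sorted(range(num_layers),
                   key=lambda l: (-w[l], (l - offset) % num_layers))
    chosen = set(order[:budget])
    return [l in chosen for l in range(num_layers)]


def layer_counts(num_layers, budgets, pattern):
    """Count, for each layer, how many clients train it."""
    counts = [0] * num_layers
    offset = 0
    for budget in budgets:
        mask = allocate(num_layers, budget, pattern, offset)
        for l, on in enumerate(mask):
            counts[l] += int(on)
        offset += budget  # rotate the tie-break start for the next client
    return counts


if __name__ == "__main__":
    budgets = [2, 4, 6]  # three clients with low, medium, and high capacity
    for name in ("triangle", "inverted_triangle", "bottleneck", "uniform"):
        print(name, layer_counts(6, budgets, name))
```

With six layers and budgets of 2, 4, and 6, the triangle pattern under these assumptions yields per-layer counts of `[1, 1, 2, 2, 3, 3]`: every client trains the deepest layers, while only the highest-capacity client trains the shallowest ones. A randomized variant in the spirit of RGD-HLA could sample layers with probability proportional to the same weights instead of taking the top-ranked ones.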
arXiv.org Artificial Intelligence
Jun-17-2025
- Country:
- North America > United States > Nevada (0.28)
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Information Technology (0.46)
- Education (0.34)
- Technology: