FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
Supplementary Materials
1 Dataset
1.1 Links and Preservation
The Croissant metadata record is available at croissant. We chose GitHub and Google Drive to store our code and dataset, respectively. Both are widely recognized as reliable data storage platforms, ensuring long-term preservation. We highly recommend downloading the raw data directly and following the provided instructions to simplify data processing. Our dataset is structured as follows: the local directory contains client-specific data for local training, while the all clients directory aggregates the data of all clients for federated learning.
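A minimal sketch of how the two directory layouts might be consumed; the file names and JSON format here are assumptions for illustration, not the dataset's documented layout:

```python
import json
from pathlib import Path

def load_client_data(root, client_id=None):
    """Load instruction data either for one client (local training)
    or pooled across all clients (federated aggregation baseline).

    Assumes a hypothetical layout: local/client_<id>.json for
    per-client data and all_clients/data.json for the pooled split.
    """
    root = Path(root)
    if client_id is not None:
        path = root / "local" / f"client_{client_id}.json"
    else:
        path = root / "all_clients" / "data.json"
    with path.open() as f:
        return json.load(f)
```

In an FL training loop, each simulated client would call this with its own id, while a centralized baseline would load the pooled file once.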
FedGCN: Convergence-Communication Tradeoffs in Federated Training of Graph Convolutional Networks
Methods for training models on graphs distributed across multiple clients have recently grown in popularity, due to the size of these graphs as well as regulations on keeping data where it is generated. However, cross-client edges naturally exist in such graphs. Thus, distributed methods for training a model on a single graph incur either significant communication overhead between clients or a loss of information available to training. We introduce the Federated Graph Convolutional Network (FedGCN) algorithm, which uses federated learning to train GCN models for semi-supervised node classification with fast convergence and little communication. Compared to prior methods that require extra communication among clients at each training round, FedGCN clients only communicate with the central server in one pre-training step, greatly reducing communication costs and allowing the use of homomorphic encryption to further enhance privacy. We theoretically analyze the tradeoff between FedGCN's convergence rate and communication cost under different data distributions. Experimental results show that our FedGCN algorithm achieves better model accuracy with 51.7\% faster convergence on average and at least 100$\times$ less communication compared to prior work.
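The one-shot pre-training communication can be illustrated with a toy sketch. All names and the aggregation rule below are illustrative assumptions; the actual FedGCN protocol additionally supports multi-hop aggregation and homomorphic encryption, which are omitted here:

```python
import numpy as np

def fedgcn_pretrain(features, adjacency, client_nodes):
    """Illustrative one-shot pre-training step: the server computes each
    node's neighbor-feature sum over the FULL graph once (so cross-client
    edges are accounted for), then sends each client only the rows for its
    own nodes. Afterwards, every GCN training round is purely local plus
    standard server-side model averaging."""
    neighbor_sums = adjacency @ features  # A @ X, including cross-client edges
    return {c: neighbor_sums[nodes] for c, nodes in client_nodes.items()}
```

The key point the sketch captures is that cross-client information is exchanged exactly once, before training, rather than at every round.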
FedAgentBench: Towards Automating Real-world Federated Medical Image Analysis with Server-Client LLM Agents
Saha, Pramit, Strong, Joshua, Mishra, Divyanshu, Ouyang, Cheng, Noble, J. Alison
Federated learning (FL) allows collaborative model training across healthcare sites without sharing sensitive patient data. However, real-world FL deployment is often hindered by complex operational challenges that demand substantial human effort. These include: (a) selecting appropriate clients (hospitals), (b) coordinating between the central server and clients, (c) client-level data pre-processing, (d) harmonizing non-standardized data and labels across clients, and (e) selecting FL algorithms based on user instructions and cross-client data characteristics. Existing FL works, however, overlook these practical orchestration challenges. These operational bottlenecks motivate the need for autonomous, agent-driven FL systems, where intelligent agents at each hospital client and a central server agent collaboratively manage FL setup and model training with minimal human intervention. To this end, we first introduce an agent-driven FL framework that captures the key phases of real-world FL workflows from client selection to training completion, and a benchmark dubbed FedAgentBench that evaluates the ability of LLM agents to autonomously coordinate healthcare FL. Our framework incorporates 40 FL algorithms, each tailored to address diverse task-specific requirements and cross-client characteristics. Furthermore, we introduce a diverse set of complex tasks across 201 carefully curated datasets, simulating 6 modality-specific real-world healthcare environments, viz., Dermatoscopy, Ultrasound, Fundus, Histopathology, MRI, and X-Ray. We assess the agentic performance of 14 open-source and 10 proprietary LLMs spanning small, medium, and large model scales. While some agent cores such as GPT-4.1 and DeepSeek V3 can automate various stages of the FL pipeline, our results reveal that more complex, interdependent tasks based on implicit goals remain challenging for even the strongest models.
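Two of the workflow phases, client selection (a) and algorithm selection (e), can be sketched as simple decision functions. Every class name, threshold, and selection rule below is an illustrative assumption, not the benchmark's actual logic:

```python
from dataclasses import dataclass

@dataclass
class Client:
    name: str
    modality: str       # e.g. "MRI", "X-Ray" (one of the 6 modalities)
    num_samples: int

def select_clients(clients, modality, min_samples=100):
    """Phase (a), illustrative: keep hospitals whose data matches the
    task modality and meets a minimum dataset size."""
    return [c for c in clients
            if c.modality == modality and c.num_samples >= min_samples]

def pick_algorithm(selected, label_skew=0.7, non_iid_threshold=0.5):
    """Phase (e), illustrative: choose an FL algorithm from cross-client
    data characteristics (here, a single label-skew statistic)."""
    return "FedProx" if label_skew > non_iid_threshold else "FedAvg"
```

In the agent-driven setting, an LLM agent would make these decisions from natural-language instructions and observed metadata rather than hard-coded thresholds; the sketch only shows the shape of the decisions being automated.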
Efficient Training of Large-Scale AI Models Through Federated Mixture-of-Experts: A System-Level Approach
Chen, Xiaobing, Zhang, Boyang, Zhou, Xiangwei, Sun, Mingxuan, Zhang, Shuai, Zhang, Songyang, Li, Geoffrey Ye
The integration of Federated Learning (FL) and Mixture-of-Experts (MoE) presents a compelling pathway for training more powerful, large-scale artificial intelligence models (LAMs) on decentralized data while preserving privacy. However, efficient federated training of these complex MoE-structured LAMs is hindered by significant system-level challenges, particularly in managing the interplay between heterogeneous client resources and the sophisticated coordination required for numerous specialized experts. This article highlights a critical yet underexplored gap: the absence of robust quantitative strategies for dynamic client-expert alignment that holistically consider varying client capacities and the imperative of system-wide load balancing. Specifically, we propose a conceptual system design for intelligent client-expert alignment that incorporates dynamic fitness scoring, global expert load monitoring, and client capacity profiling. By tackling these systemic issues, we can unlock more scalable, efficient, and robust training mechanisms with fewer communication rounds to convergence, paving the way for the widespread deployment of large-scale federated MoE-structured LAMs in edge computing with ultra-high communication efficiency.
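A toy version of dynamic client-expert alignment combining the three proposed ingredients (fitness scoring, global load monitoring, capacity profiling). The scoring formula and greedy assignment are assumptions chosen for illustration, not the article's actual design:

```python
def fitness(capacity, affinity, load, alpha=0.5):
    """Illustrative fitness score for a (client, expert) pair: reward
    compute capacity and data-expert affinity, penalize experts that
    the global monitor reports as already heavily loaded."""
    return capacity * affinity - alpha * load

def assign_experts(clients, experts, affinity, capacity, alpha=0.5):
    """Greedy client-to-expert alignment that updates the global expert
    load after each assignment, so over-subscribed experts are
    progressively avoided (system-wide load balancing)."""
    load = {e: 0.0 for e in experts}
    assignment = {}
    for c in clients:
        best = max(experts,
                   key=lambda e: fitness(capacity[c], affinity[(c, e)],
                                         load[e], alpha))
        assignment[c] = best
        load[best] += 1.0
    return assignment
```

With the load penalty active, two clients with nearly identical affinities end up on different experts, which is exactly the balancing behavior the article argues is missing from current federated MoE training.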
Local Data Quantity-Aware Weighted Averaging for Federated Learning with Dishonest Clients
Wu, Leming, Jin, Yaochu, Hao, Kuangrong, Yu, Han
Federated learning (FL) enables collaborative training of deep learning models without requiring data to leave local clients, thereby preserving client privacy. The aggregation process on the server plays a critical role in the performance of the resulting FL model. The most commonly used aggregation method is weighted averaging based on the amount of data from each client, which is thought to reflect each client's contribution. However, this method is prone to model bias, as dishonest clients might report inaccurate training data volumes to the server, which are hard to verify. To address this issue, we propose a novel secure Federated Data quantity-aware weighted averaging method (FedDua). It enables FL servers to accurately predict the amount of training data held by each client based on their uploaded local model gradients. Furthermore, it can be seamlessly integrated into any FL algorithm that involves server-side model aggregation. Extensive experiments on three benchmarking datasets demonstrate that FedDua improves global model performance by an average of 3.17% compared to four popular FL aggregation methods in the presence of inaccurate client data volume declarations.
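The quantity-weighted averaging this paper hardens can be sketched as follows. In FedDua, the server would substitute its gradient-based estimates of each client's data volume for the (possibly dishonest) declared counts; the estimation step itself is the paper's contribution and is not reproduced here:

```python
def weighted_average(models, data_counts):
    """Standard FedAvg-style aggregation: each parameter is averaged
    with weights proportional to the per-client data quantity, which
    may be either declared by clients or estimated by the server."""
    total = sum(data_counts)
    return {k: sum(w * m[k] for m, w in zip(models, data_counts)) / total
            for k in models[0]}
```

The sketch makes the attack surface visible: a client that inflates its entry in `data_counts` directly inflates its influence on every aggregated parameter, which is why verifying those counts server-side matters.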