Guo, Yongxin
FedCCRL: Federated Domain Generalization with Cross-Client Representation Learning
Wang, Xinpeng, Guo, Yongxin, Tang, Xiaoying
Domain Generalization (DG) aims to train models that can effectively generalize to unseen domains. However, in the context of Federated Learning (FL), where clients collaboratively train a model without directly sharing their data, most existing DG algorithms are not directly applicable to the FL setting due to privacy constraints, as well as the limited data quantity and domain diversity at each client. To tackle these challenges, we propose FedCCRL, a lightweight federated domain generalization method that significantly improves the model's generalization ability while preserving privacy and ensuring computational and communication efficiency. Specifically, FedCCRL comprises two principal modules: the first is a cross-client feature extension module, which increases local domain diversity via cross-client domain transfer and domain-invariant feature perturbation; the second is a representation and prediction dual-stage alignment module, which enables the model to effectively capture domain-invariant features. Extensive experimental results demonstrate that FedCCRL achieves the state-of-the-art performance on the PACS, OfficeHome and miniDomainNet datasets across FL settings of varying numbers of clients. Code is available at https://github.com/sanphouwang/fedccrl
FedMABA: Towards Fair Federated Learning through Multi-Armed Bandits Allocation
Wang, Zhichao, Wang, Lin, Guo, Yongxin, Zhang, Ying-Jun Angela, Tang, Xiaoying
The increasing concern for data privacy has driven the rapid development of federated learning (FL), a privacy-preserving collaborative paradigm. However, the statistical heterogeneity among clients in FL results in inconsistent performance of the server model across various clients. Server model may show favoritism towards certain clients while performing poorly for others, heightening the challenge of fairness. In this paper, we reconsider the inconsistency in client performance distribution and introduce the concept of adversarial multi-armed bandit to optimize the proposed objective with explicit constraints on performance disparities. Practically, we propose a novel multi-armed bandit-based allocation FL algorithm (FedMABA) to mitigate performance unfairness among diverse clients with different data distributions. Extensive experiments, in different Non-I.I.D. scenarios, demonstrate the exceptional performance of FedMABA in enhancing fairness.
Computational Pathology for Accurate Prediction of Breast Cancer Recurrence: Development and Validation of a Deep Learning-based Tool
Su, Ziyu, Guo, Yongxin, Wesolowski, Robert, Tozbikian, Gary, O'Connell, Nathaniel S., Niazi, M. Khalid Khan, Gurcan, Metin N.
Accurate recurrence risk stratification is crucial for optimizing treatment plans for breast cancer patients. Current prognostic tools like Oncotype DX (ODX) offer valuable genomic insights for HR+/HER2-patients but are limited by cost and accessibility, particularly in underserved populations. In this study, we present Deep-BCR-Auto, a deep learning-based computational pathology approach that predicts breast cancer recurrence risk from routine H&E-stained whole slide images (WSIs). Our methodology was validated on two independent cohorts: the TCGA-BRCA dataset and an in-house dataset from The Ohio State University (OSU). Deep-BCR-Auto demonstrated robust performance in stratifying patients into low-and high-recurrence risk categories. On the TCGA-BRCA dataset, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.827, significantly outperforming existing weakly supervised models (p=0.041). In the independent OSU dataset, Deep-BCR-Auto maintained strong generalizability, achieving an AUROC of 0.832, along with 82.0% accuracy, 85.0% specificity, and 67.7% sensitivity. These findings highlight the potential of computational pathology as a cost-effective alternative for recurrence risk assessment, broadening access to personalized treatment strategies. This study underscores the clinical utility of integrating deep learning-based computational pathology into routine pathological assessment for breast cancer prognosis across diverse clinical settings. Keywords: Computational pathology, Breast cancer, Deep learning, Oncotype-DX, Image analysis 1. Introduction Breast cancer is the most prevalent cancer and the second biggest reason for cancer-related death in women in the United States [1]. The effective treatment options and prognosis for breast cancer patients are highly dependent on the patient's molecular subtype of breast cancer as determined by estrogen, progesterone, and human epidermal growth factor 2 (HER2) receptor expression. Among all different subtypes, hormone receptorpositive (HR+) and epidermal growth factor receptor-negative (HER2-) breast cancer represents the most common entity, accounting for approximately 65% of all cases [2, 3].
Smart Sampling: Helping from Friendly Neighbors for Decentralized Federated Learning
Wang, Lin, Chen, Yang, Guo, Yongxin, Tang, Xiaoying
Federated Learning (FL) is gaining widespread interest for its ability to share knowledge while preserving privacy and reducing communication costs. Unlike Centralized FL, Decentralized FL (DFL) employs a network architecture that eliminates the need for a central server, allowing direct communication among clients and leading to significant communication resource savings. However, due to data heterogeneity, not all neighboring nodes contribute to enhancing the local client's model performance. In this work, we introduce \textbf{\emph{AFIND+}}, a simple yet efficient algorithm for sampling and aggregating neighbors in DFL, with the aim of leveraging collaboration to improve clients' model performance. AFIND+ identifies helpful neighbors, adaptively adjusts the number of selected neighbors, and strategically aggregates the sampled neighbors' models based on their contributions. Numerical results on real-world datasets with diverse data partitions demonstrate that AFIND+ outperforms other sampling algorithms in DFL and is compatible with most existing DFL optimization algorithms.
Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing
Guo, Yongxin, Wang, Lin, Tang, Xiaoying, Lin, Tao
Federated Learning (FL) is a privacy-preserving distributed machine learning paradigm. Nonetheless, the substantial distribution shifts among clients pose a considerable challenge to the performance of current FL algorithms. To mitigate this challenge, various methods have been proposed to enhance the FL training process. This paper endeavors to tackle the issue of data heterogeneity from another perspective -- by improving FL algorithms prior to the actual training stage. Specifically, we introduce the Client2Vec mechanism, which generates a unique client index for each client before the commencement of FL training. Subsequently, we leverage the generated client index to enhance the subsequent FL training process. To demonstrate the effectiveness of the proposed Client2Vec method, we conduct three case studies that assess the impact of the client index on the FL training process. These case studies encompass enhanced client sampling, model aggregation, and local training. Extensive experiments conducted on diverse datasets and model architectures show the efficacy of Client2Vec across all three case studies. Our code is avaliable at \url{https://github.com/LINs-lab/client2vec}.
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Guo, Yongxin, Cheng, Zhenglin, Tang, Xiaoying, Lin, Tao
The Sparse Mixture of Experts (SMoE) has been widely employed to enhance the efficiency of training and inference for Transformer-based foundational models, yielding promising results. However, the performance of SMoE heavily depends on the choice of hyper-parameters, such as the number of experts and the number of experts to be activated (referred to as top-k), resulting in significant computational overhead due to the extensive model training by searching over various hyper-parameter configurations. As a remedy, we introduce the Dynamic Mixture of Experts (DynMoE) technique. DynMoE incorporates (1) a novel gating method that enables each token to automatically determine the number of experts to activate. (2) An adaptive process automatically adjusts the number of experts during training. Extensive numerical results across Vision, Language, and Vision-Language tasks demonstrate the effectiveness of our approach to achieve competitive performance compared to GMoE for vision and language tasks, and MoE-LLaVA for vision-language tasks, while maintaining efficiency by activating fewer parameters. Our code is available at https://github.com/LINs-lab/DynMoE.
Find Your Optimal Assignments On-the-fly: A Holistic Framework for Clustered Federated Learning
Guo, Yongxin, Tang, Xiaoying, Lin, Tao
Federated Learning (FL) is an emerging distributed machine learning approach that preserves client privacy by storing data on edge devices. However, data heterogeneity among clients presents challenges in training models that perform well on all local distributions. Recent studies have proposed clustering as a solution to tackle client heterogeneity in FL by grouping clients with distribution shifts into different clusters. However, the diverse learning frameworks used in current clustered FL methods make it challenging to integrate various clustered FL methods, gather their benefits, and make further improvements. To this end, this paper presents a comprehensive investigation into current clustered FL methods and proposes a four-tier framework, namely HCFL, to encompass and extend existing approaches. Based on the HCFL, we identify the remaining challenges associated with current clustering methods in each tier and propose an enhanced clustering method called HCFL+ to address these challenges. Through extensive numerical evaluations, we showcase the effectiveness of our clustering framework and the improved components. Our code will be publicly available.
FedRC: Tackling Diverse Distribution Shifts Challenge in Federated Learning by Robust Clustering
Guo, Yongxin, Tang, Xiaoying, Lin, Tao
Federated Learning (FL) is a machine learning paradigm that safeguards privacy by retaining client data on edge devices. However, optimizing FL in practice can be challenging due to the diverse and heterogeneous nature of the learning system. Though recent research has focused on improving the optimization of FL when distribution shifts occur among clients, ensuring global performance when multiple types of distribution shifts occur simultaneously among clients -- such as feature distribution shift, label distribution shift, and concept shift -- remain under-explored. In this paper, we identify the learning challenges posed by the simultaneous occurrence of diverse distribution shifts and propose a clustering principle to overcome these challenges. Through our research, we find that existing methods failed to address the clustering principle. Therefore, we propose a novel clustering algorithm framework, dubbed as FedRC, which adheres to our proposed clustering principle by incorporating a bi-level optimization problem and a novel objective function. Extensive experiments demonstrate that FedRC significantly outperforms other SOTA cluster-based FL methods. Our code will be publicly available.
FedBR: Improving Federated Learning on Heterogeneous Data via Local Learning Bias Reduction
Guo, Yongxin, Tang, Xiaoying, Lin, Tao
Federated Learning (FL) is a way for machines to learn from data that is kept locally, in order to protect the privacy of clients. This is typically done using local SGD, which helps to improve communication efficiency. However, such a scheme is currently constrained by slow and unstable convergence due to the variety of data on different clients' devices. In this work, we identify three under-explored phenomena of biased local learning that may explain these challenges caused by local updates in supervised FL. As a remedy, we propose FedBR, a novel unified algorithm that reduces the local learning bias on features and classifiers to tackle these challenges. FedBR has two components. The first component helps to reduce bias in local classifiers by balancing the output of the models. The second component helps to learn local features that are similar to global features, but different from those learned from other data sources. We conducted several experiments to test \algopt and found that it consistently outperforms other SOTA FL methods. Both of its components also individually show performance gains. Our code is available at https://github.com/lins-lab/fedbr.
Towards Federated Learning on Time-Evolving Heterogeneous Data
Guo, Yongxin, Lin, Tao, Tang, Xiaoying
Federated Learning (FL) is a learning paradigm that protects privacy by keeping client data on edge devices. However, optimizing FL in practice can be difficult due to the diversity and heterogeneity of the learning system. Despite recent research efforts to improve the optimization of heterogeneous data, the impact of time-evolving heterogeneous data in real-world scenarios, such as changing client data or intermittent clients joining or leaving during training, has not been studied well. In this work, we propose Continual Federated Learning (CFL), a flexible framework for capturing the time-evolving heterogeneity of FL. CFL can handle complex and realistic scenarios, which are difficult to evaluate in previous FL formulations, by extracting information from past local data sets and approximating local objective functions. We theoretically demonstrate that CFL methods have a faster convergence rate than FedAvg in time-evolving scenarios, with the benefit depending on approximation quality. Through experiments, we show that our numerical findings match the convergence analysis and that CFL methods significantly outperform other state-of-the-art FL baselines.