 federated multi-task learning


Federated Multi-Task Learning under a Mixture of Distributions

Neural Information Processing Systems

The increasing size of data generated by smartphones and IoT devices motivated the development of Federated Learning (FL), a framework for on-device collaborative training of machine learning models. First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client, due to the inherent heterogeneity of local data distributions. Federated multi-task learning (MTL) approaches can learn personalized models by formulating an opportune penalized optimization problem. The penalization term can capture complex relations among personalized models, but eschews clear statistical assumptions about local data distributions. In this work, we propose to study federated MTL under the flexible assumption that each local data distribution is a mixture of unknown underlying distributions. This assumption encompasses most of the existing personalized FL approaches and leads to federated EM-like algorithms for both client-server and fully decentralized settings. Moreover, it provides a principled way to serve personalized models to clients not seen at training time. The algorithms' convergence is analyzed through a novel federated surrogate optimization framework, which can be of general interest. Experimental results on FL benchmarks show that our approach provides models with higher accuracy and fairness than state-of-the-art methods.
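As a rough illustration of the mixture assumption, the sketch below runs one EM-style update on a single client: each sample gets a responsibility for each underlying component, and the client's mixture weights are re-estimated from those responsibilities. It is a minimal sketch, not the authors' algorithm. The function names, the per-sample loss table, and the use of `exp(-loss)` as a likelihood are all assumptions made for illustration.

```python
import math

def e_step(losses, weights):
    """Compute responsibilities for one client.

    losses[i][m] is the loss of (hypothetical) component model m on sample i;
    weights[m] is the client's current mixture weight (assumed > 0).
    Returns q with q[i][m] proportional to weights[m] * exp(-losses[i][m]).
    """
    q = []
    for row in losses:
        logits = [math.log(w) - l for w, l in zip(weights, row)]
        mx = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - mx) for v in logits]
        total = sum(exps)
        q.append([e / total for e in exps])
    return q

def m_step_weights(q):
    """Re-estimate mixture weights as the average responsibility per component."""
    n = len(q)
    return [sum(row[m] for row in q) / n for m in range(len(q[0]))]

# Illustrative data: three samples, two components, uniform initial weights.
losses = [[0.1, 2.0], [0.2, 3.0], [2.5, 0.1]]
q = e_step(losses, [0.5, 0.5])
new_weights = m_step_weights(q)
```

In this toy case two of the three samples are fit better by the first component, so its weight grows above 0.5 after the update. In a federated setting the component models themselves would also be updated and aggregated across clients; that step is omitted here.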


Federated Multi-Task Learning

Neural Information Processing Systems

Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory for the first time consider issues of high communication cost, stragglers, and fault tolerance for distributed multi-task learning. The resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets.


Towards Unified Modeling in Federated Multi-Task Learning via Subspace Decoupling

Wei, Yipan, Zou, Yuchen, Li, Yapeng, Du, Bo

arXiv.org Artificial Intelligence

Federated Multi-Task Learning (FMTL) enables multiple clients to perform heterogeneous tasks without exchanging their local data, offering broad potential for privacy-preserving multi-task collaboration. However, most existing methods focus on building personalized models for each client and are unable to support the aggregation of multiple heterogeneous tasks into a unified model. As a result, in real-world scenarios where task objectives, label spaces, and optimization paths vary significantly, conventional FMTL methods struggle to achieve effective joint training. To address this challenge, we propose FedDEA (Federated Decoupled Aggregation), an update-structure-aware aggregation method specifically designed for multi-task model integration. Our method dynamically identifies task-relevant dimensions based on the response strength of local updates and enhances their optimization effectiveness through rescaling. This mechanism effectively suppresses cross-task interference and enables task-level decoupled aggregation within a unified global model. FedDEA does not rely on task labels or architectural modifications, making it broadly applicable and deployment-friendly. Experimental results demonstrate that it can be easily integrated into various mainstream federated optimization algorithms and consistently delivers significant overall performance improvements on the widely used NYUD-V2 and PASCAL-Context datasets. These results validate the robustness and generalization capabilities of FedDEA under highly heterogeneous task settings.
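The abstract does not spell out how task-relevant dimensions are chosen or how they are rescaled. The sketch below shows one possible reading, under stated assumptions: keep the largest-magnitude coordinates of a local update, zero the rest, and rescale the kept coordinates so the update keeps its original norm. The top-fraction rule, the norm-preserving rescale, and the name `mask_and_rescale` are illustrative assumptions, not FedDEA's actual procedure.

```python
import math

def mask_and_rescale(update, keep_fraction=0.5):
    """Keep the largest-magnitude coordinates of a local update and rescale them.

    Assumption: "response strength" is approximated by absolute value, and the
    rescale restores the L2 norm of the original update.
    """
    k = max(1, int(len(update) * keep_fraction))
    # Indices of the k coordinates with the largest absolute value.
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    top = set(order[:k])
    masked = [u if i in top else 0.0 for i, u in enumerate(update)]
    orig_norm = math.sqrt(sum(u * u for u in update))
    kept_norm = math.sqrt(sum(u * u for u in masked))
    scale = orig_norm / kept_norm if kept_norm > 0 else 1.0
    return [scale * u for u in masked]

# Illustrative update vector: coordinates 0 and 2 dominate.
update = [3.0, 0.1, -4.0, 0.2]
rescaled = mask_and_rescale(update, keep_fraction=0.5)
```

Here the two small coordinates are dropped and the two large ones are scaled up slightly, so updates from different tasks would overlap on fewer coordinates when aggregated.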


MIRA: A Method of Federated MultI-Task Learning for LaRge LAnguage Models

Elbakary, Ahmed, Issaid, Chaouki Ben, ElBatt, Tamer, Seddik, Karim, Bennis, Mehdi

arXiv.org Artificial Intelligence

In this paper, we introduce a method for fine-tuning Large Language Models (LLMs) in a federated manner, inspired by multi-task learning. Our approach leverages the structure of each client's model and enables a learning scheme that considers other clients' tasks and data distributions. To mitigate the extensive computational and communication overhead often associated with LLMs, we utilize a parameter-efficient fine-tuning method, specifically Low-Rank Adaptation (LoRA), reducing the number of trainable parameters. Experimental results, with different datasets and models, demonstrate the proposed method's effectiveness compared to existing frameworks for federated fine-tuning of LLMs in terms of average and local performance. The proposed scheme outperforms existing baselines by achieving lower local loss for each client while maintaining comparable global performance.
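To make the parameter saving from LoRA concrete: LoRA freezes a weight matrix of shape d x k and trains a low-rank update B A, with B of shape d x r and A of shape r x k, so the trainable count drops from d*k to r*(d + k). The sketch below computes both counts. The layer size and rank are illustrative assumptions, not values taken from the paper.

```python
def full_trainable_params(d, k):
    """Parameters updated when fine-tuning a dense d x k weight matrix directly."""
    return d * k

def lora_trainable_params(d, k, r):
    """Parameters updated by a rank-r LoRA adapter: B is d x r, A is r x k."""
    return r * (d + k)

# Hypothetical layer: a 4096 x 4096 projection with a rank-8 adapter.
d, k, r = 4096, 4096, 8
full = full_trainable_params(d, k)      # 16,777,216
lora = lora_trainable_params(d, k, r)   # 65,536
reduction = full // lora                # 256x fewer trainable parameters
```

In a federated setting only the adapter matrices would need to be communicated, which is where the communication saving mentioned in the abstract would come from.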


Reviews: Federated Multi-Task Learning

Neural Information Processing Systems

The paper generalizes an existing framework for distributed multitask learning (COCOA) and extends it to handle practical systems challenges such as communication cost, stragglers, etc. The proposed technique (MOCHA) makes the framework more robust and faster.

Pros:
- Rigorous theoretical analysis
- Sound experimental methodology

Cons:
- Not very novel compared to [47]; it is a mild variation of [47], including the analysis.

Equation 5 depends on \Delta \alpha_t^* (the minimizer of the sub-problem). My understanding is that this minimizer is not known and cannot be computed accurately at every node, and therefore we need the approximation.


Federated Multi-Task Learning on Non-IID Data Silos: An Experimental Study

Yang, Yuwen, Lu, Yuxiang, Huang, Suizhi, Sirejiding, Shalayiding, Lu, Hongtao, Ding, Yue

arXiv.org Artificial Intelligence

The Federated Multi-Task Learning (FMTL) approach combines the benefits of Federated Learning (FL) and Multi-Task Learning (MTL), enabling collaborative model training on multi-task learning datasets. However, a comprehensive evaluation method that integrates the unique features of both FL and MTL is currently absent in the field. This paper fills this void by introducing a novel framework, FMTL-Bench, for systematic evaluation of the FMTL paradigm. This benchmark covers various aspects at the data, model, and optimization algorithm levels, and comprises seven sets of comparative experiments, encapsulating a wide array of non-independent and identically distributed (Non-IID) data partitioning scenarios. We propose a systematic process for comparing baselines across diverse indicators and conduct a case study on communication expenditure, time, and energy consumption. Through our exhaustive experiments, we aim to provide valuable insights into the strengths and limitations of existing baseline methods, contributing to the ongoing discourse on optimal FMTL application in practical scenarios. The source code can be found at https://github.com/youngfish42/FMTL-Benchmark.


Federated Multi-Task Learning for THz Wideband Channel and DoA Estimation

Elbir, Ahmet M., Shi, Wei, Mishra, Kumar Vijay, Chatzinotas, Symeon

arXiv.org Artificial Intelligence

This paper addresses two major challenges in terahertz (THz) channel estimation: the beam-split phenomenon, i.e., beam misalignment caused by frequency-independent analog beamformers, and the computational complexity that comes from using an ultra-massive number of antennas to compensate for propagation losses. Data-driven techniques are known to mitigate the complexity of this problem but usually require the transmission of datasets from the users to a central server, entailing huge communication overhead. In this work, we introduce a federated multi-task learning (FMTL) scheme, wherein the users transmit only the model parameters instead of the whole dataset, for THz channel and user direction-of-arrival (DoA) estimation to improve communication efficiency. We first propose a novel beamspace support alignment technique for channel estimation with beam-split correction. Then, the channel and DoA information are used as labels to train an FMTL model. By exploiting the sparsity of the THz channel, the proposed approach is implemented with fewer pilot signals than traditional techniques. Compared to previous works, our FMTL approach provides higher channel estimation accuracy as well as approximately 25 (32) times lower model (channel) training overhead.


Federated Multi-Task Learning

Smith, Virginia, Chiang, Chao-Kai, Sanjabi, Maziar, Talwalkar, Ameet S.

Neural Information Processing Systems

Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory for the first time consider issues of high communication cost, stragglers, and fault tolerance for distributed multi-task learning. The resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets. Papers published at the Neural Information Processing Systems Conference.