Federated Clustering via Matrix Factorization Models: From Model Averaging to Gradient Sharing

Wang, Shuai, Chang, Tsung-Hui

arXiv.org Machine Learning 

Recently, federated learning (FL) has drawn significant attention due to its capability of training a model over a network without access to the clients' private raw data. In this paper, we study the unsupervised clustering problem under the FL setting. By adopting a generalized matrix factorization model for clustering, we propose two novel (first-order) federated clustering (FedC) algorithms based on the principles of model averaging and gradient sharing, respectively, and present their theoretical convergence conditions. We show that both algorithms have an O(1/T) convergence rate, where T is the total number of gradient evaluations per client, and that the communication cost can be effectively reduced by controlling the local epoch length and allowing partial client participation within each communication round. Numerical experiments show that the FedC algorithm based on gradient sharing outperforms that based on model averaging, especially in scenarios with non-i.i.d. data.

I. INTRODUCTION

As one of the most fundamental data mining tasks, unsupervised clustering has a vast range of applications [1]. In view of the increasing volume of real-life data, distributed clustering methods that can process large-scale datasets in parallel computing environments have gained significant interest in the last decade [2], [3], [4]. However, the recent emphasis on user privacy has called for new distributed schemes that can perform clustering without directly accessing the users' raw data.
