Federated Clustering via Matrix Factorization Models: From Model Averaging to Gradient Sharing

Feb-12-2020–arXiv.org Machine Learning

Recently, federated learning (FL) has drawn significant attention due to its ability to train a model over a network without access to the clients' private raw data. In this paper, we study the unsupervised clustering problem under the FL setting. By adopting a generalized matrix factorization model for clustering, we propose two novel (first-order) federated clustering (FedC) algorithms based on the principles of model averaging and gradient sharing, respectively, and present their theoretical convergence conditions. We show that both algorithms have an O(1/T) convergence rate, where T is the total number of gradient evaluations per client, and that the communication cost can be effectively reduced by controlling the local epoch length and allowing partial client participation within each communication round. Numerical experiments show that the FedC algorithm based on gradient sharing outperforms the one based on model averaging, especially in scenarios with non-i.i.d. data.

I. INTRODUCTION

As one of the most fundamental data mining tasks, unsupervised clustering has a vast range of applications [1]. In view of the increasing volume of real-life data, distributed clustering methods that can process large-scale datasets in parallel computing environments have gained significant interest in the last decade [2], [3], [4]. However, the recent emphasis on user privacy has called for new distributed schemes that can perform clustering without directly accessing the users' raw data.
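As a rough illustration of the two principles named in the abstract, the sketch below contrasts one communication round of model averaging with one round of gradient sharing on a toy k-means-style objective, in which the centroids play the role of the shared factor and each client keeps its own data and assignments. It is not the paper's algorithm: the hard nearest-centroid assignment, full client participation, the step size, the synthetic data, and every function name here are assumptions made for illustration only, and the toy run is not meant to reproduce the paper's comparison between the two schemes.

```python
import random

def nearest(x, centroids):
    """Index of the centroid closest to point x (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])))

def local_gradient(points, centroids):
    """Gradient of 0.5 * sum_i ||x_i - w_a(i)||^2 with respect to each centroid,
    with the assignment a(i) fixed to the nearest centroid (an assumption)."""
    dim = len(centroids[0])
    grad = [[0.0] * dim for _ in centroids]
    for x in points:
        k = nearest(x, centroids)
        for d in range(dim):
            grad[k][d] += centroids[k][d] - x[d]
    return grad

def average(mats):
    """Element-wise mean of a list of equally shaped K x dim nested lists."""
    n = len(mats)
    return [[sum(m[k][d] for m in mats) / n for d in range(len(mats[0][0]))]
            for k in range(len(mats[0]))]

def step(centroids, grad, lr):
    """One gradient-descent step; returns new lists, inputs are not mutated."""
    return [[w - lr * g for w, g in zip(wk, gk)]
            for wk, gk in zip(centroids, grad)]

def model_averaging_round(clients, centroids, lr, local_steps):
    """Each client runs several local steps; the server averages the models."""
    local_models = []
    for points in clients:
        w = centroids
        for _ in range(local_steps):
            w = step(w, local_gradient(points, w), lr)
        local_models.append(w)
    return average(local_models)

def gradient_sharing_round(clients, centroids, lr):
    """Each client sends one gradient; the server averages and takes one step."""
    grads = [local_gradient(points, centroids) for points in clients]
    return step(centroids, average(grads), lr)

def cost(clients, centroids):
    """Sum of squared distances from every point to its nearest centroid."""
    return sum(min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids)
               for points in clients for x in points)

def demo(rounds=10, lr=0.02, seed=0):
    rng = random.Random(seed)
    # Non-i.i.d. toy split: each client only sees points from one cluster.
    clients = [[(rng.gauss(c, 0.5), rng.gauss(c, 0.5)) for _ in range(20)]
               for c in (0.0, 5.0)]
    init = [[1.0, 1.0], [4.0, 4.0]]
    w_avg, w_gs = init, init
    for _ in range(rounds):
        w_avg = model_averaging_round(clients, w_avg, lr, local_steps=5)
        w_gs = gradient_sharing_round(clients, w_gs, lr)
    return cost(clients, init), cost(clients, w_avg), cost(clients, w_gs)

if __name__ == "__main__":
    initial, averaged, shared = demo()
    print("initial cost:", initial)
    print("after model averaging:", averaged)
    print("after gradient sharing:", shared)
```

In both rounds only centroids or centroid gradients leave a client, never the raw points. The difference is where the optimization steps happen: model averaging takes several steps locally before communicating, whereas gradient sharing communicates once per step taken at the server.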




