Collaborating Authors

 Chi, Chong-Yung


Differentially Private Federated Clustering over Non-IID Data

arXiv.org Artificial Intelligence

In this paper, we investigate the federated clustering (FedC) problem, which aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a parameter server while preserving data privacy. Though it is an NP-hard optimization problem involving real variables denoting cluster centroids and binary variables denoting the cluster membership of each data sample, we judiciously reformulate the FedC problem into a non-convex optimization problem with only one convex constraint, accordingly yielding a soft clustering solution. We then propose a novel FedC algorithm using the differential privacy (DP) technique, referred to as DP-FedC, which also accounts for partial client participation and multiple local model updating steps. Furthermore, various attributes of the proposed DP-FedC are obtained through theoretical analyses of privacy protection and convergence rate, especially for the case of non-independent and identically distributed (non-i.i.d.) data; these ideally serve as guidelines for the design of DP-FedC. Finally, experimental results on two real datasets demonstrate the efficacy of the proposed DP-FedC, its much superior performance over some state-of-the-art FedC algorithms, and its consistency with all the presented analytical results.
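The abstract does not say which DP mechanism DP-FedC uses, so the following is only a generic sketch of one common way to privatize a client's local update before it is sent to the parameter server: clip the update's L2 norm, then add Gaussian noise. The function names and the parameters `clip_norm` and `noise_std` are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def clip_l2(vec, clip_norm):
    """Scale vec down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= clip_norm or norm == 0.0:
        return list(vec)
    scale = clip_norm / norm
    return [v * scale for v in vec]


def privatize_update(update, clip_norm, noise_std, rng):
    """Clip a local update, then add i.i.d. Gaussian noise to each entry.

    Hypothetical helper: noise_std would normally be calibrated to
    clip_norm and the target privacy budget, which is not done here.
    """
    clipped = clip_l2(update, clip_norm)
    return [v + rng.gauss(0.0, noise_std) for v in clipped]


if __name__ == "__main__":
    rng = random.Random(0)
    local_update = [3.0, 4.0]  # toy centroid update from one client
    print(privatize_update(local_update, clip_norm=1.0, noise_std=0.1, rng=rng))
```

Clipping bounds how much any single client can change the aggregate, which is what allows the noise scale to be tied to a privacy guarantee.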


Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification

arXiv.org Artificial Intelligence

Federated learning (FL) has been recognized as a rapidly growing research area, where the model is trained over massively distributed clients under the orchestration of a parameter server (PS) without sharing clients' data. This paper delves into a class of federated problems characterized by non-convex and non-smooth loss functions, which are prevalent in FL applications but challenging to handle due to their intricate non-convex and non-smooth nature and the conflicting requirements of communication efficiency and privacy protection. We propose a novel federated primal-dual algorithm with bidirectional model sparsification tailored for non-convex and non-smooth FL problems, with differential privacy applied for a strong privacy guarantee. Its unique, insightful properties, together with privacy and convergence analyses, are also presented as guidelines for FL algorithm design. Extensive experiments on real-world data demonstrate the effectiveness of the proposed algorithm and its much superior performance over some state-of-the-art FL algorithms, and validate all the analytical results and properties.


Maximum Volume Inscribed Ellipsoid: A New Simplex-Structured Matrix Factorization Framework via Facet Enumeration and Convex Optimization

arXiv.org Machine Learning

Consider a structured matrix factorization model where one factor is restricted to have its columns lying in the unit simplex. This simplex-structured matrix factorization (SSMF) model and the associated factorization techniques have spurred much research interest across different areas, such as hyperspectral unmixing in remote sensing and topic discovery in machine learning, to name a few. In this paper we develop a new theoretical SSMF framework whose idea is to study a maximum volume ellipsoid inscribed in the convex hull of the data points. This maximum volume inscribed ellipsoid (MVIE) idea has not been attempted in prior literature, and we show a sufficient condition under which the MVIE framework guarantees exact recovery of the factors. This sufficient recovery condition is much more relaxed than that of separable non-negative matrix factorization (or pure-pixel search); coincidentally, it is also identical to that of minimum volume enclosing simplex, which is known to be a powerful SSMF framework for non-separable problem instances. We also show that MVIE can be practically implemented by performing facet enumeration and then solving a convex optimization problem. The potential of the MVIE framework is illustrated by numerical results.
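For orientation, the model and the inscribed-ellipsoid idea described above can be written out as follows. The abstract gives no notation, so the symbols $X$, $A$, $S$, $F$, and $c$ are assumptions; this is the standard textbook form of an SSMF model and of a maximum volume inscribed ellipsoid problem, not necessarily the exact formulation used in the paper.

```latex
% SSMF model: n data points x_i, each a convex combination of the N columns of A
X = A S, \qquad
X = [\, x_1, \ldots, x_n \,], \quad
S = [\, s_1, \ldots, s_n \,], \qquad
s_i \ge 0, \quad \mathbf{1}^{T} s_i = 1 \quad (i = 1, \ldots, n).

% Ellipsoid parameterized by a symmetric positive definite F and a center c
\mathcal{E}(F, c) = \{\, F \alpha + c \;:\; \|\alpha\|_2 \le 1 \,\}.

% Maximum volume ellipsoid inscribed in the convex hull of the data
\max_{F \succ 0,\; c} \; \det(F)
\quad \text{subject to} \quad
\mathcal{E}(F, c) \subseteq \operatorname{conv}\{ x_1, \ldots, x_n \}.
```

The volume of $\mathcal{E}(F, c)$ is proportional to $\det(F)$, which is why maximizing the determinant maximizes the volume. Once the convex hull is described by its facets as finitely many linear inequalities, the inclusion constraint becomes convex in $(F, c)$, which is consistent with the facet enumeration step mentioned in the abstract.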


Identifiability of the Simplex Volume Minimization Criterion for Blind Hyperspectral Unmixing: The No Pure-Pixel Case

arXiv.org Machine Learning

In blind hyperspectral unmixing (HU), the pure-pixel assumption is well known to be powerful in enabling simple and effective blind HU solutions. However, the pure-pixel assumption is not always satisfied in an exact sense, especially in scenarios where pixels are heavily mixed. In the no pure-pixel case, a good blind HU approach to consider is the minimum volume enclosing simplex (MVES). Empirical experience has suggested that MVES algorithms can perform well without pure pixels, although it has not been entirely clear from a theoretical viewpoint why this is so. This paper aims to address that theoretical gap. We develop an analysis framework wherein the perfect endmember identifiability of MVES is studied in the noiseless case. We prove that MVES is indeed robust against the lack of pure pixels, as long as the pixels are not too heavily mixed and too asymmetrically spread. The theoretical results are verified by numerical simulations.
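The quantity that an MVES criterion minimizes is the volume of a simplex enclosing the data. As a small illustration of that quantity only (not of an MVES solver, and not of the paper's analysis), the sketch below computes the volume of a simplex with N vertices in (N-1)-dimensional space using the standard determinant formula. The function names are hypothetical, and the setup assumes the data have already been reduced to N-1 dimensions.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def simplex_volume(vertices):
    """Volume of a simplex given N vertices in (N-1)-dimensional space."""
    base = vertices[0]
    edges = [[v[i] - base[i] for i in range(len(base))] for v in vertices[1:]]
    return abs(determinant(edges)) / math.factorial(len(edges))


if __name__ == "__main__":
    small = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    large = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
    print(simplex_volume(small), simplex_volume(large))
```

Both triangles in the usage example contain the small triangle's vertices, so among these two candidates a minimum-volume criterion would prefer the smaller one.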