private model
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.94)
FedThief: Harming Others to Benefit Oneself in Self-Centered Federated Learning
In federated learning, participants' uploaded model updates cannot be directly verified, leaving the system vulnerable to malicious attacks. Existing attack strategies have adversaries upload tampered model updates to degrade the global model's performance. In real-world scenarios, however, attackers are driven by self-centered motives: their goal is to gain a competitive advantage by developing a model that outperforms those of other participants, not merely to cause disruption. In this paper, we study a novel Self-Centered Federated Learning (SCFL) attack paradigm, in which attackers not only degrade the performance of the global model through attacks but also enhance their own models within the federated learning process. We propose a framework named FedThief, which degrades the performance of the global model by uploading modified content during the upload stage. At the same time, it enhances the private model's performance through divergence-aware ensemble techniques (where "divergence" quantifies the deviation between the private and global models) that integrate global updates and local knowledge. Extensive experiments show that our method effectively degrades the global model's performance while allowing the attacker to obtain an ensemble model that significantly outperforms the global model.

In the field of machine learning, the quality and diversity of the training data are widely recognized as essential prerequisites for enabling models to generalize effectively to unseen data and perform reliably across a range of downstream tasks [1], [2]. These characteristics directly influence the learned model's empirical risk minimization, hypothesis space coverage, and robustness to distributional shifts [3], [4].
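As a rough illustration of the divergence-aware ensemble idea described in the abstract, the following sketch (all names hypothetical, assuming linear models for simplicity; not the paper's actual FedThief mechanism) blends the private and global models' predictions with a weight driven by their parameter divergence:

```python
import numpy as np

def divergence_weighted_ensemble(private_params, global_params, x):
    """Hypothetical sketch: blend two linear models, down-weighting the
    global model as its parameters diverge from the private ones."""
    # Divergence: L2 distance between the two parameter vectors.
    div = np.linalg.norm(private_params - global_params)
    # Map divergence to a blending weight in (0, 1]: the larger the
    # divergence, the less the ensemble trusts the global model.
    alpha = 1.0 / (1.0 + div)
    # alpha weights the global prediction; (1 - alpha) the private one.
    return alpha * (x @ global_params) + (1.0 - alpha) * (x @ private_params)
```

When the two models agree exactly, the divergence is zero and the ensemble reduces to the shared prediction; as the attacker's private model drifts away, the private prediction dominates.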
- Asia > China > Hubei Province > Wuhan (0.04)
- Europe > United Kingdom > England > Surrey > Guildford (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Education (1.00)
- Health & Medicine (0.94)
An Efficient Private GPT Never Autoregressively Decodes
Li, Zhengyi, Guan, Yue, Yang, Kang, Feng, Yu, Liu, Ning, Yu, Yu, Leng, Jingwen, Guo, Minyi
The wide deployment of the generative pre-trained transformer (GPT) has raised privacy concerns for both clients and servers. While cryptographic primitives can be employed for secure GPT inference to protect the privacy of both parties, they introduce considerable performance overhead. To accelerate secure inference, this study proposes a public decoding and secure verification approach that utilizes public GPT models, motivated by the observation that securely decoding one token and securely decoding multiple tokens incur similar latency. The client uses the public model to generate a set of tokens, which are then securely verified by the private model for acceptance. The efficiency of our approach depends on the acceptance ratio of tokens proposed by the public model, which we improve from two aspects: (1) a private sampling protocol optimized for cryptographic primitives and (2) model alignment using knowledge distillation. Our approach improves the efficiency of secure decoding while maintaining the same level of privacy and generation quality as standard secure decoding. Experiments demonstrate a $2.1\times \sim 6.0\times$ speedup over standard decoding across three pairs of public-private models and different network conditions.
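The draft-then-verify pattern the abstract describes resembles speculative decoding; here is a minimal sketch, assuming greedy next-token functions `public_next` and `private_next` as stand-ins for the public and private models (all names hypothetical, and with none of the cryptographic machinery the paper actually uses):

```python
def draft_and_verify(public_next, private_next, prefix, k):
    """Hypothetical sketch of public drafting + private verification.
    public_next/private_next map a token sequence to the next token id
    (stand-ins for greedy decoding with the public/private model)."""
    # The public model drafts k candidate tokens autoregressively.
    draft = []
    ctx = list(prefix)
    for _ in range(k):
        token = public_next(ctx)
        draft.append(token)
        ctx.append(token)
    # The private model checks the drafts: accept the longest prefix on
    # which both models agree, then append one private-model token.
    accepted = list(prefix)
    for token in draft:
        if private_next(accepted) == token:
            accepted.append(token)
        else:
            break
    accepted.append(private_next(accepted))
    return accepted
```

The speedup comes from the verification step being batched: checking all k drafts securely costs about as much as decoding a single token, so every accepted draft token is nearly free.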
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > Canada (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Concept Drift Detection using Ensemble of Integrally Private Models
Varshney, Ayush K., Torra, Vicenc
Deep neural networks (DNNs) are among the most widely used machine learning algorithms. DNNs require the training data to be available beforehand with true labels. This is not feasible for many real-world problems where data arrives in streaming form and acquisition of true labels is scarce and expensive. In the literature, little attention has been given to the privacy aspect of streaming data, whose distribution may change frequently. These concept drifts must be detected privately in order to avoid any disclosure risk from DNNs. Existing privacy models use concept drift detection schemes such as ADWIN and KSWIN to detect drifts. In this paper, we focus on the notion of integrally private DNNs to detect concept drift. Integrally private DNNs are models that recur frequently when trained on different datasets. Based on this, we introduce an ensemble methodology, which we call the 'Integrally Private Drift Detection' (IPDD) method, to detect concept drift from private models. Our IPDD method does not require labels to detect drift but assumes true labels are available once drift has been detected. We experiment with binary and multi-class synthetic and real-world data. Our experimental results show that our methodology can privately detect concept drift, has utility comparable to (and in some cases better than) ADWIN, and outperforms the utility of differentially private models at various privacy levels. The source code for the paper is available \hyperlink{https://github.com/Ayush-Umu/Concept-drift-detection-Using-Integrally-private-models}{here}.
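One way to flag drift without labels, in the spirit of the ensemble idea above, is to monitor disagreement among the ensemble's members on a window of incoming data; the following is a minimal sketch (names and threshold hypothetical, not the paper's IPDD algorithm):

```python
import numpy as np

def disagreement_drift(ensemble_preds, threshold=0.3):
    """Hypothetical sketch: flag drift when the ensemble's members
    disagree on too large a fraction of a data window (no labels needed).
    ensemble_preds: array of shape (n_models, n_samples) of class labels."""
    preds = np.asarray(ensemble_preds)
    # A sample counts as a disagreement if not all models predict the
    # same class for it.
    disagree = (preds != preds[0]).any(axis=0)
    rate = disagree.mean()
    return rate > threshold, rate
```

The intuition is that models trained on the pre-drift distribution tend to agree on in-distribution samples and scatter on out-of-distribution ones, so a rising disagreement rate is a label-free drift signal.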
- South America > Brazil > Maranhão (0.04)
- Europe > Sweden > Östergötland County > Linköping (0.04)
- Europe > Sweden > Västerbotten County > Umeå (0.04)
P4: Towards private, personalized, and Peer-to-Peer learning
Maheri, Mohammad Mahdi, Siby, Sandra, Abdollahi, Sina, Borovykh, Anastasia, Haddadi, Hamed
Personalized learning is a proposed approach to address the problem of data heterogeneity in collaborative machine learning. In a decentralized setting, the two main challenges of personalization are client clustering and data privacy. In this paper, we address these challenges by developing P4 (Personalized Private Peer-to-Peer), a method that ensures each client receives a personalized model while maintaining differential privacy guarantees on each client's local dataset during and after training. Our approach includes the design of a lightweight algorithm to identify similar clients and group them in a private, peer-to-peer (P2P) manner. Once clients are grouped, we develop differentially private knowledge distillation that lets them co-train with minimal impact on accuracy. We evaluate our proposed method on three benchmark datasets (FEMNIST, or Federated EMNIST, CIFAR-10, and CIFAR-100) and two neural network architectures (linear and CNN-based networks) across a range of privacy parameters. The results demonstrate the potential of P4, as it outperforms the state-of-the-art differentially private P2P approaches by up to 40 percent in terms of accuracy. We also show the practicality of P4 by implementing it on resource-constrained devices and validating that it has minimal overhead, e.g., about 7 seconds to run collaborative training between two clients.
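The client-grouping step can be pictured as greedy clustering of (optionally noised) model-update vectors by cosine similarity; the sketch below is purely illustrative (all names and the noise mechanism are assumptions, not P4's actual private protocol):

```python
import numpy as np

def group_similar_clients(updates, sim_threshold=0.5, noise_scale=0.0, rng=None):
    """Hypothetical sketch: greedily group clients whose (optionally
    noised, for a DP-style flavor) update vectors are cosine-similar."""
    rng = rng or np.random.default_rng(0)
    # Optionally perturb each update before comparison, so the raw
    # vectors are never shared in the clear.
    vecs = [u + rng.normal(0.0, noise_scale, size=len(u)) for u in updates]
    vecs = [v / (np.linalg.norm(v) + 1e-12) for v in vecs]
    groups = []
    for i, v in enumerate(vecs):
        for g in groups:
            # Compare against the group's first member (its "anchor").
            if float(v @ vecs[g[0]]) >= sim_threshold:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups
```

Clients with similar data produce similarly oriented updates, so cosine similarity over updates is a cheap proxy for data similarity without exchanging any data.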
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > United States > Virginia (0.04)
- Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
Personalized Federated Learning via Stacking
Federated Learning (FL) is an area of research that develops methods to allow multiple parties to collaboratively train machine learning models without exchanging data. First introduced in 2016 by McMahan et al. to allow a large number of edge devices to collaboratively train language models [1], FL has been successfully applied to several domains where, for regulatory or privacy reasons, models cannot be trained on centralized pooled data. Most FL approaches result in a single collaboratively trained global model that is used by every client for inference. Personalized Federated Learning (PFL) recognizes that in some non-IID contexts performance improvements are possible if each client adapts or personalizes the global model to its data. Approaches range from clients fine-tuning the global model on private data to client clustering, and others discussed in Section 2. In this paper, we build on prior work [2] and explore a simple personalization approach that avoids training a global model which is then personalized. Instead, clients employ privacy-preserving techniques [3] to train a model on their data and make it public to the federation.
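Stacking over public client models can be sketched as a client fitting a meta-learner, on its own private data, over the predictions of the federation's published base models. The sketch below uses a least-squares meta-learner for concreteness (the function names and the linear meta-model are assumptions, not the paper's method):

```python
import numpy as np

def stack_predictions(base_models, X):
    """Stack the base models' outputs column-wise as meta-features."""
    return np.column_stack([m(X) for m in base_models])

def fit_meta_weights(base_models, X, y):
    """Hypothetical sketch: a client fits a least-squares meta-learner
    on its own private data over the federation's public base models."""
    Z = stack_predictions(base_models, X)
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

def predict(base_models, w, X):
    """Personalized prediction: weighted combination of base outputs."""
    return stack_predictions(base_models, X) @ w
```

Because the meta-weights are fit locally, each client's stack naturally emphasizes the base models that transfer best to its own distribution, which is exactly the personalization effect.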
- Research Report (1.00)
- Overview (0.68)
On the Impact of Output Perturbation on Fairness in Binary Linear Classification
Emelianov, Vitalii, Perrot, Michaël
We theoretically study how differential privacy interacts with both individual and group fairness in binary linear classification. More precisely, we focus on the output perturbation mechanism, a classic approach in privacy-preserving machine learning. We derive high-probability bounds on the level of individual and group fairness that the perturbed models can achieve compared to the original model. For individual fairness, we prove that the impact of output perturbation on the level of fairness is bounded but grows with the dimension of the model. For group fairness, we show that this impact is determined by the distribution of so-called angular margins, that is, the signed margins of the non-private model rescaled by the norm of each example.
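Output perturbation itself is simple to state: train non-privately, then add calibrated Laplace noise to the released parameters. A minimal sketch (the function name and the sensitivity/epsilon calibration are the generic textbook mechanism, not this paper's specific analysis):

```python
import numpy as np

def output_perturbation(weights, epsilon, sensitivity, rng=None):
    """Hypothetical sketch of output perturbation: train non-privately,
    then add Laplace noise scaled to sensitivity/epsilon to the learned
    weight vector before release."""
    rng = rng or np.random.default_rng(0)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=len(weights))
    return np.asarray(weights) + noise
```

The paper's dimension dependence is visible here: each of the d coordinates receives independent noise, so the total perturbation of the weight vector grows with d, and with it the worst-case impact on individual fairness.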
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Plymouth County > Hanover (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- Africa > Rwanda > Kigali > Kigali (0.04)
Differential Privacy has Bounded Impact on Fairness in Classification
Mangold, Paul, Perrot, Michaël, Bellet, Aurélien, Tommasi, Marc
We theoretically study the impact of differential privacy on fairness in classification. We prove that, given a class of models, popular group fairness measures are pointwise Lipschitz-continuous with respect to the parameters of the model. This result is a consequence of a more general statement on accuracy conditioned on an arbitrary event (such as membership in a sensitive group), which may be of independent interest. We use this Lipschitz property to prove a non-asymptotic bound showing that, as the number of samples increases, the fairness level of private models gets closer to that of their non-private counterparts. This bound also highlights the importance of a model's confidence margin for the disparate impact of differential privacy.
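Written out schematically, the pointwise Lipschitz property referred to here takes the following form (the notation is assumed for illustration, not the authors' own):

```latex
% For a group fairness measure F over model parameters \theta, and a
% constant L that may depend on the point (hence "pointwise"):
\bigl| F(\theta) - F(\theta') \bigr| \;\le\; L \,\lVert \theta - \theta' \rVert .
```

Combined with the fact that private training perturbs $\theta$ by an amount that shrinks as the sample size grows, this immediately bounds how far the private model's fairness level can drift from the non-private one's.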
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Accurate, Explainable, and Private Models: Providing Recourse While Minimizing Training Data Leakage
Huang, Catherine, Swoopes, Chelse, Xiao, Christina, Ma, Jiaqi, Lakkaraju, Himabindu
Machine learning models are increasingly utilized across impactful domains to predict individual outcomes. As such, many models provide algorithmic recourse to individuals who receive negative outcomes. However, recourse can be leveraged by adversaries to disclose private information. This work presents the first attempt at mitigating such attacks. We present two novel methods to generate differentially private recourse: Differentially Private Model (DPM) and Laplace Recourse (LR). Using logistic regression classifiers and real-world and synthetic datasets, we find that DPM and LR perform well in reducing what an adversary can infer, especially at low false-positive rates (FPR). When the training dataset is large enough, we find particular success in preventing privacy leakage while maintaining model and recourse accuracy with our novel LR method.
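The Laplace Recourse idea can be pictured as noising a counterfactual before it is released; the following is a hypothetical sketch for a linear score (the boundary-projection step and the `laplace_recourse` name are assumptions for illustration, not the paper's exact LR method):

```python
import numpy as np

def laplace_recourse(x, w, b, epsilon, sensitivity=1.0, rng=None):
    """Hypothetical sketch in the spirit of the LR idea: compute the
    closest point on a linear model's decision boundary, then add
    Laplace noise before releasing it as recourse."""
    rng = rng or np.random.default_rng(0)
    w = np.asarray(w, dtype=float)
    # Orthogonal projection of x onto the boundary w.x + b = 0, i.e. the
    # smallest change that flips the linear score to zero.
    x_cf = x - ((w @ x + b) / (w @ w)) * w
    # Noise the counterfactual so it leaks less about the model/data.
    return x_cf + rng.laplace(0.0, sensitivity / epsilon, size=len(x_cf))
```

The privacy/utility trade-off is explicit: smaller epsilon means noisier recourse that reveals less about the decision boundary, at the cost of the suggested change being less precise.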
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)