Free-riders in Federated Learning: Attacks and Defenses
Jierui Lin, Min Du, and Jian Liu
University of California, Berkeley

November 28, 2019

Abstract--Federated learning is a recently proposed paradigm that enables multiple clients to collaboratively train a joint model. Clients train models locally, and a parameter server generates a global model by aggregating the locally submitted gradient updates at each round. Although the incentive model for federated learning has not been fully developed, participants are expected to receive rewards, or the privilege of using the final global model, as compensation for the effort of training it. A client who has no local data therefore has an incentive to fabricate gradient updates in order to claim these rewards. In this paper, we are the first to propose the notion of free-rider attacks and to explore possible ways an attacker may construct gradient updates without any local training data. Furthermore, we explore possible defenses that could detect the proposed attacks, and propose a new high-dimensional detection method called STD-DAGMM, which works particularly well for anomaly detection over model parameters. We extend the attacks and defenses to settings with multiple free-riders as well as differential privacy, which sheds light on, and calls for, future research in this field.

I. INTRODUCTION

Federated learning [1], [2], [3] has been proposed to facilitate joint model training that leverages data from multiple clients, with the training process coordinated by a parameter server. Throughout the process, clients' data stay local, and only model parameters are communicated among clients through the parameter server. A typical training iteration works as follows. First, the parameter server sends the newest global model to each client. Then, each client updates the model using its local data and reports the updated gradients to the parameter server. Finally, the server aggregates all submitted local updates to form a new global model, which performs better than a model trained on any single client's data. Compared with the alternative of simply collecting all clients' data and training a model centrally, federated learning saves communication overhead by transmitting only model parameters, and protects privacy since all data stay local.
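To make the training round concrete, the following is a minimal Python/NumPy sketch of one FedAvg-style aggregation round, with a naive data-free client included to show where a free-rider's submission enters the protocol. The function names (local_update, free_rider_update, fedavg_round), the one-step linear model, and the noise-perturbation strategy are illustrative assumptions on our part, not the exact constructions studied in this paper.

import numpy as np

rng = np.random.default_rng(0)

def local_update(w, data, labels, lr=0.01):
    # Honest client: one gradient step of a linear model with
    # squared-error loss on its local data (illustrative choice).
    grad = data.T @ (data @ w - labels) / len(labels)
    return w - lr * grad

def free_rider_update(w, scale=1e-3):
    # Naive free-rider: has no data, so it fabricates a plausible
    # update by perturbing the received global weights with noise
    # (an illustrative strategy, not the paper's exact attack).
    return w + rng.normal(scale=scale, size=w.shape)

def fedavg_round(w, honest_clients):
    # Server side: average the submitted models, weighted by the
    # (claimed) local dataset sizes, FedAvg-style.
    updates = [local_update(w, d, y) for d, y in honest_clients]
    sizes = [len(y) for _, y in honest_clients]
    updates.append(free_rider_update(w))  # attacker's submission
    sizes.append(np.mean(sizes))          # attacker claims an average-size dataset
    weights = np.asarray(sizes, float) / np.sum(sizes)
    return np.average(updates, axis=0, weights=weights)

# Toy run: three honest clients, ten communication rounds.
w = np.zeros(5)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
for _ in range(10):
    w = fedavg_round(w, clients)

In a real deployment each client would run several local epochs of SGD on a shared neural architecture; the single linear-model step above only serves to show that the server aggregates whatever parameters clients submit, with no direct visibility into whether any local data was actually used.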