Benchmarking Semi-supervised Federated Learning

Zhengming Zhang, Zhewei Yao, Yaoqing Yang, Yujun Yan, Joseph E. Gonzalez, Michael W. Mahoney

arXiv.org (Machine Learning)

Current state-of-the-art machine learning models can potentially benefit from the large amount of user data held privately on mobile devices, as well as the computing power available locally on those devices. In response, federated learning (FL), which requires transmitting only the trained (intermediate) models, has been proposed as a privacy-preserving way to exploit the data and computing power on mobile devices [1, 2]. In a typical FL pipeline, a server maintains a model and shares it with users/devices. Each user/device updates the shared global model for multiple steps using only its locally held data and then uploads the updated model back to the server. After collecting the models from the users, the server aggregates them with an averaging step (e.g., FedAvg [2]) and sends the averaged model back to the users [1, 3]. This approach respects privacy in the (weak) sense that the server never accesses the private user data at any point in the procedure. However, prior work in FL has made the unrealistic assumption that the data stored on the local devices are fully annotated with ground-truth labels and that the server has no access to any labeled data. In fact, the private data on local devices are more often unlabeled, since annotating data requires both time and domain knowledge [4, 5], and servers are often hosted by organizations that do have labeled data.
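To make the pipeline above concrete, the following is a minimal sketch of FedAvg communication rounds under simplifying assumptions: a linear least-squares model trained with plain SGD on each client, and hypothetical helper names (`local_sgd`, `fedavg_round`) that are illustrative rather than taken from any released code.

```python
import numpy as np

def local_sgd(weights, X, y, lr=0.1, steps=5):
    """Client update: run a few SGD steps using only locally held data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w

def fedavg_round(global_w, client_data):
    """One round: broadcast the global model, train locally, average."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_sgd(global_w, X, y))
        sizes.append(len(y))
    sizes = np.array(sizes, dtype=float)
    # The server sees only uploaded parameters, never the raw client data.
    return np.average(updates, axis=0, weights=sizes / sizes.sum())

# Toy usage: 3 clients, each holding a private shard of a shared linear task.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(20):
    w = fedavg_round(w, clients)
print(w)  # converges toward true_w
```

Note that, matching the abstract's description, each client performs multiple local steps per round (here `steps=5`), and the server's aggregation is a dataset-size-weighted average of the uploaded models.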
