Matrix Factorization has been very successful in practical recommendation applications and e-commerce. Due to data shortage and stringent regulations, it can be hard to collect sufficient data to build performant recommender systems for a single company. Federated learning provides the possibility to bridge the data silos and build machine learning models without compromising privacy and security. Participants sharing common users or items collaboratively build a model over data from all the participants. There have been some works exploring the application of federated learning to recommender systems and the privacy issues in collaborative filtering systems. However, the privacy threats in federated matrix factorization are not studied. In this paper, we categorize federated matrix factorization into three types based on the partition of feature space and analyze privacy threats against each type of federated matrix factorization model. We also discuss privacy-preserving approaches. As far as we are aware, this is the first study of privacy threats of the matrix factorization method in the federated learning framework.
The five interns all look up. Brad, a burly caucasian jock, waves hello overenthusiastically. Kai, a nonbinary Japanese-American hacker, plays with a Rubix cube. Devi, a bubbly Indian-American networker, snaps a selfie. Mateo, a scrawny Hispanic bookworm, pauses in the middle of eating a sandwich. Aliyah, a sharply-dressed African-American security enthusiast, looks unimpressed.
TensorFlow Federated (TFF) is an open-source framework for machine learning and other computations on decentralized data. TFF has been developed to facilitate open research and experimentation with Federated Learning (FL), an approach to machine learning where a shared global model is trained across many participating clients that keep their training data locally. By eliminating the need to collect data at a central location, yet still enabling each participant to benefit from the collective knowledge of everything in the network, FL lets you build intelligent applications that leverage insights from data that might be too costly, sensitive, or impractical to collect. In this session, we explain the key concepts behind FL and TFF, how to set up a FL experiment and run it in a simulator, what the code looks like and how to extend it, and we briefly discuss options for future deployment to real devices.
Advancements in the power of machine learning have brought with them major data privacy concerns. This is especially true when it comes to training machine learning models with data obtained from the interaction of users with devices such as smartphones. So the big question is, how do we train and improve these on-device machine learning models without sharing personally-identifiable data? That is the question that we'll seek to answer in this look at a technique known as federated learning. The traditional process for training a machine learning model involves uploading data to a server and using that to train models.
Federated Learning is an emerging privacy-preserving distributed machine learning approach to building a shared model by performing distributed training locally on participating devices (clients) and aggregating the local models into a global one. As this approach prevents data collection and aggregation, it helps in reducing associated privacy risks to a great extent. However, the data samples across all participating clients are usually not independent and identically distributed (noni.i.d.), and Out of Distribution (OOD) generalization for the learned models can be poor. Besides this challenge, federated learning also remains vulnerable to various attacks on security wherein a few malicious participating entities work towards inserting backdoors, degrading the generated aggregated model as well as inferring the data owned by participating entities. In this paper, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup and analyse empirically how it enhances the Out of Distribution (OOD) accuracy as well as the privacy of the final learned model.