collaborative machine learning
Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning
In collaborative machine learning(CML), multiple agents pool their resources(e.g., data) together for a common learning task. In realistic CML settings where the agents are self-interested and not altruistic, they may be unwilling to share data or model information without adequate rewards. Furthermore, as the data/model information shared by the agents may differ in quality, designing rewards which are fair to them is important so that they would not feel exploited nor discouraged from sharing. In this paper, we adopt federated learning as the CML paradigm, propose a novel cosine gradient Shapley value(CGSV) to fairly evaluate the expected marginal contribution of each agent's uploaded model parameter update/gradient without needing an auxiliary validation dataset, and based on the CGSV, design a novel training-time gradient reward mechanism with a fairness guarantee by sparsifying the aggregated parameter update/gradient downloaded from the server as reward to each agent such that its resulting quality is commensurate to that of the agent's uploaded parameter update/gradient. We empirically demonstrate the effectiveness of our fair gradient reward mechanism on multiple benchmark datasets in terms of fairness, predictive performance, and time overhead.
Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning
In collaborative machine learning(CML), multiple agents pool their resources(e.g., data) together for a common learning task. In realistic CML settings where the agents are self-interested and not altruistic, they may be unwilling to share data or model information without adequate rewards. Furthermore, as the data/model information shared by the agents may differ in quality, designing rewards which are fair to them is important so that they would not feel exploited nor discouraged from sharing. In this paper, we adopt federated learning as the CML paradigm, propose a novel cosine gradient Shapley value(CGSV) to fairly evaluate the expected marginal contribution of each agent's uploaded model parameter update/gradient without needing an auxiliary validation dataset, and based on the CGSV, design a novel training-time gradient reward mechanism with a fairness guarantee by sparsifying the aggregated parameter update/gradient downloaded from the server as reward to each agent such that its resulting quality is commensurate to that of the agent's uploaded parameter update/gradient. We empirically demonstrate the effectiveness of our fair gradient reward mechanism on multiple benchmark datasets in terms of fairness, predictive performance, and time overhead.
Federated Learning: Collaborative Machine Learning with a Tutorial on How to Get Started - KDnuggets
Federated learning, also known as collaborative learning, allows training models at scale on data that remains distributed on the devices where they are generated. Sensitive data remains with the owners of said data, where training is conducted, and a centralized training orchestrator of training only sees the contribution of each client through model updates. Federated learning doesn't guarantee privacy on its own (we'll touch on breaking and repairing privacy in federated learning systems later on), but it does make privacy possible. With the public and policy-makers becoming more aware of the data economy, demand for privacy-preserving machine learning is on the rise. As a result, data practices have been garnering increased scrutiny, and research on privacy-respecting tools like federated learning is increasingly active.
Collaborative Machine Learning with Incentive-Aware Model Rewards
Sim, Rachael Hwee Ling, Zhang, Yehong, Chan, Mun Choon, Low, Bryan Kian Hsiang
Collaborative machine learning (ML) is an appealing paradigm to build high-quality ML models by training on the aggregated data from many parties. However, these parties are only willing to share their data when given enough incentives, such as a guaranteed fair reward based on their contributions. This motivates the need for measuring a party's contribution and designing an incentive-aware reward scheme accordingly. This paper proposes to value a party's reward based on Shapley value and information gain on model parameters given its data. Subsequently, we give each party a model as a reward. To formally incentivize the collaboration, we define some desirable properties (e.g., fairness and stability) which are inspired by cooperative game theory but adapted for our model reward that is uniquely freely replicable. Then, we propose a novel model reward scheme to satisfy fairness and trade off between the desirable properties via an adjustable parameter. The value of each party's model reward determined by our scheme is attained by injecting Gaussian noise to the aggregated training data with an optimized noise variance. We empirically demonstrate interesting properties of our scheme and evaluate its performance using synthetic and real-world datasets.
Additively Homomorphical Encryption based Deep Neural Network for Asymmetrically Collaborative Machine Learning
The financial sector presents many opportunities to apply various machine learning techniques. Centralized machine learning creates a constraint which limits further applications in finance sectors. Data privacy is a fundamental challenge for a variety of finance and insurance applications that account on learning a model across different sections. In this paper, we define a new practical scheme of collaborative machine learning that one party owns data, but another party owns labels only, and term this \textbf{Asymmetrically Collaborative Machine Learning}. For this scheme, we propose a novel privacy-preserving architecture where two parties can collaboratively train a deep learning model efficiently while preserving the privacy of each party's data. More specifically, we decompose the forward propagation and backpropagation of the neural network into four different steps and propose a novel protocol to handle information leakage in these steps. Our extensive experiments on different datasets demonstrate not only stable training without accuracy loss, but also more than 100 times speedup compared with the state-of-the-art system.
Federated Machine Learning - Collaborative Machine Learning without Centralised Training Data
Federated machine learning was developed by McMahan et al who principally developed it for mobile devices such as cell phones to train a local model. These local models are aggregated centrally into a final central model. The original paper produced by McMahan presented a number of experiments on image classification, and language models where they used 2000 individual local models each were generated with small amounts of data. The image classification task they undertook was CIFAR-10 image classification challenge, and the language model task produced an LM of Shakespear's plays. The image classification task gained an accuracy of 99% using a Convolutional Neural Network in a Federated strategy.