
Collaborating Authors

 Triastcyn, Aleksei


DP-REC: Private & Communication-Efficient Federated Learning

arXiv.org Machine Learning

Privacy and communication efficiency are important challenges in federated training of neural networks, and combining them is still an open problem. In this work, we develop a method that unifies highly compressed communication and differential privacy (DP). We introduce a compression technique based on Relative Entropy Coding (REC) to the federated setting. With a minor modification to REC, we obtain a provably differentially private learning algorithm, DP-REC, and show how to compute its privacy guarantees. Our experiments demonstrate that DP-REC drastically reduces communication costs while providing privacy guarantees comparable to the state-of-the-art.

The performance of modern neural-network-based machine learning models scales exceptionally well with the amount of data they are trained on (Kaplan et al., 2020; Henighan et al., 2020). At the same time, industry (Xiao & Karlin), legislators (Dwork, 2019; Voigt & Von dem Bussche, 2017) and consumers (Laziuk, 2021) have become more conscious of the need to protect the privacy of the data that might be used to train such models. Federated learning (FL) is a machine learning principle that enables learning on decentralized data by computing updates on-device. Instead of sending its data to a central location, a "client" in a federation of devices sends model updates computed on its data to the central server. Such an approach to learning from decentralized data promises to unlock the computing capabilities of billions of edge devices and to enable personalized models and new applications. On the other hand, the federated paradigm brings challenges along many dimensions, such as learning from non-i.i.d. data. Neural network training requires many passes over the data, resulting in repeated transfer of the model and updates between the server and the clients, potentially making communication a primary bottleneck (Kairouz et al., 2019; Wang et al., 2021). Compressing updates is an active area of research in FL and an essential step in "untethering" edge devices from WiFi.
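The abstract does not spell out how REC compresses an update, so the following is a minimal, hypothetical sketch of REC-style compression for a single update vector: client and server share a prior through a common random seed, the client draws a fixed number of candidate samples from that prior, re-weights them by an (assumed Gaussian) target distribution centred on its local update, and transmits only the index of the sampled candidate, i.e. roughly log2(K) bits. The function names `rec_encode`/`rec_decode` and all distribution choices are illustrative assumptions rather than the paper's implementation, and the DP modification and its privacy accounting are omitted.

```python
# Hypothetical sketch of REC-style compressed communication of a model update.
# Assumptions: Gaussian prior shared via a common random seed, Gaussian "target"
# distribution centred on the client's local update. Not the DP-REC algorithm itself.
import numpy as np

def rec_encode(update, seed, num_candidates=2**10, prior_std=1.0, post_std=0.1):
    """Client side: pick one shared-prior sample that best represents `update`."""
    rng = np.random.default_rng(seed)                      # shared randomness
    cand = rng.normal(0.0, prior_std, size=(num_candidates, update.size))
    # Importance weights proportional to q(z | update) / p(z) for a Gaussian q.
    log_q = -0.5 * np.sum((cand - update) ** 2, axis=1) / post_std**2
    log_p = -0.5 * np.sum(cand ** 2, axis=1) / prior_std**2
    log_w = log_q - log_p
    probs = np.exp(log_w - log_w.max())
    probs /= probs.sum()
    index = rng.choice(num_candidates, p=probs)            # ~log2(num_candidates) bits sent
    return int(index)

def rec_decode(index, seed, dim, num_candidates=2**10, prior_std=1.0):
    """Server side: regenerate the same candidates and look up the chosen one."""
    rng = np.random.default_rng(seed)
    cand = rng.normal(0.0, prior_std, size=(num_candidates, dim))
    return cand[index]

# Usage: the server recovers only a coarse, randomised surrogate of the client's update.
local_update = np.array([0.3, -0.2, 0.5, 0.1])
idx = rec_encode(local_update, seed=42)
approx_update = rec_decode(idx, seed=42, dim=local_update.size)
```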


Differential Privacy Meets Maximum-weight Matching

arXiv.org Artificial Intelligence

When it comes to large-scale multi-agent systems with a diverse set of agents, traditional differential privacy (DP) mechanisms are ill-matched because they consider a very broad class of adversaries and protect all users, independent of their characteristics, by the same guarantee. Achieving meaningful privacy then leads to a pronounced reduction in solution quality. Such assumptions are unnecessary in many real-world applications for three key reasons: (i) users might be willing to disclose less sensitive information (e.g., city of residence, but not exact location), (ii) the attacker might possess auxiliary information (e.g., city of residence in a mobility-on-demand system, or reviewer expertise in a paper assignment problem), and (iii) domain characteristics might exclude a subset of solutions (an expert on auctions would not be assigned to review a robotics paper, so there is no need for indistinguishability between reviewers in different fields). We introduce Piecewise Local Differential Privacy (PLDP), a privacy model designed to protect the utility function in applications where the attacker possesses additional information on the characteristics of the utility space. PLDP enables a high degree of privacy while being applicable to real-world, unboundedly large settings. Moreover, we propose PALMA, a privacy-preserving heuristic for maximum-weight matching. We evaluate PALMA in a vehicle-passenger matching scenario using real data and demonstrate that it provides strong privacy, $\varepsilon \leq 3$ with a median of $\varepsilon = 0.44$, and high-quality matchings (only $10.8\%$ worse than the non-private optimal).
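As a rough illustration of privatizing a utility function before matching (explicitly not the PALMA heuristic or the piecewise mechanism behind PLDP), the sketch below perturbs each agent's edge utilities with Laplace noise and then runs a simple greedy maximum-weight matching on the noisy weights; the value of `epsilon`, the sensitivity, and the greedy heuristic are all assumptions made for the example.

```python
# Illustrative sketch (not PALMA): each agent locally perturbs its edge utilities
# with Laplace noise before a greedy maximum-weight matching is computed on them.
import numpy as np

def privatize_utilities(utilities, epsilon, sensitivity=1.0, rng=None):
    """Local perturbation of one agent's utility vector via the Laplace mechanism."""
    if rng is None:
        rng = np.random.default_rng()
    return utilities + rng.laplace(0.0, sensitivity / epsilon, size=utilities.shape)

def greedy_matching(weights):
    """Greedy maximum-weight bipartite matching on the (noisy) weight matrix."""
    weights = weights.astype(float).copy()
    matches = []
    while np.isfinite(weights).any():
        i, j = np.unravel_index(np.argmax(weights), weights.shape)
        matches.append((i, j))
        weights[i, :] = -np.inf   # row i (e.g., vehicle) is now matched
        weights[:, j] = -np.inf   # column j (e.g., passenger) is now matched
    return matches

# Usage: 3 vehicles x 3 passengers with utilities in [0, 1].
rng = np.random.default_rng(0)
true_util = rng.random((3, 3))
noisy_util = np.vstack([privatize_utilities(row, epsilon=1.0, rng=rng) for row in true_util])
print(greedy_matching(noisy_util))
```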


Improved Accounting for Differentially Private Learning

arXiv.org Machine Learning

We consider the problem of differential privacy accounting, i.e. estimating privacy loss bounds, in machine learning in a broad sense. We propose two versions of a generic privacy accountant suitable for a wide range of learning algorithms. Both versions are derived in a simple and principled way using well-known tools from probability theory, such as concentration inequalities. We demonstrate that our privacy accountant achieves state-of-the-art estimates of DP guarantees and can be applied to new areas, such as variational inference. Moreover, we show that variational inference enjoys differential privacy at minor cost.
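For context on what a privacy accountant computes, below is a standard textbook Rényi-DP accountant for compositions of a (non-subsampled) Gaussian mechanism with sensitivity 1; it is not the concentration-inequality-based accountant proposed in the paper, and the chosen `sigma`, number of steps and `delta` are arbitrary example values.

```python
# Generic Renyi-DP accountant sketch for `steps` compositions of a Gaussian mechanism
# with noise multiplier sigma (sensitivity 1). Shown only to illustrate what privacy
# accounting computes; it ignores subsampling and is not the paper's accountant.
import numpy as np

def gaussian_rdp(alpha, sigma):
    """RDP of order alpha for a single Gaussian-mechanism release."""
    return alpha / (2.0 * sigma**2)

def compose_and_convert(sigma, steps, delta, orders=np.arange(2, 257)):
    """Compose `steps` releases in RDP, then convert to the tightest (epsilon, delta)."""
    eps_per_order = [
        steps * gaussian_rdp(a, sigma) + np.log(1.0 / delta) / (a - 1.0)
        for a in orders
    ]
    best = int(np.argmin(eps_per_order))
    return eps_per_order[best], orders[best]

# Usage: 10 noisy releases with sigma = 5.0 at delta = 1e-5.
eps, order = compose_and_convert(sigma=5.0, steps=10, delta=1e-5)
print(f"epsilon ~ {eps:.2f} at Renyi order {order}")
```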


Generating Differentially Private Datasets Using GANs

arXiv.org Machine Learning

In this paper, we present a technique for generating artificial datasets that retain statistical properties of the real data while providing differential privacy guarantees with respect to this data. We include a Gaussian noise layer in the discriminator of a generative adversarial network to make the output and the gradients differentially private with respect to the training data, and then use the generator component to synthesise a privacy-preserving artificial dataset. Our experiments show that under a reasonably small privacy budget we are able to generate data of high quality and successfully train machine learning models on this artificial data.
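A minimal PyTorch sketch of the idea of a Gaussian noise layer inside the discriminator is given below; the architecture, the noise standard deviation, and the placement of the noise layer are illustrative assumptions rather than the paper's exact setup, and the code does not perform the privacy accounting required to certify a DP guarantee.

```python
# Minimal sketch of a GAN discriminator with a Gaussian noise layer, in the spirit of
# the approach described above. Architecture and noise scale are assumptions; no
# privacy accounting is performed here.
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Adds i.i.d. Gaussian noise to its input during training."""
    def __init__(self, std=0.5):
        super().__init__()
        self.std = std

    def forward(self, x):
        if self.training:
            return x + torch.randn_like(x) * self.std
        return x

class NoisyDiscriminator(nn.Module):
    """MLP discriminator whose hidden representation passes through the noise layer."""
    def __init__(self, input_dim=784, hidden_dim=256, noise_std=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            GaussianNoise(noise_std),   # noise injected before the final score
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x):
        return self.net(x)

# Usage: score a batch of flattened 28x28 images.
disc = NoisyDiscriminator()
scores = disc(torch.randn(16, 784))
```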