Learning to Noise: Application-Agnostic Data Sharing with Local Differential Privacy
Alex Mansbridge, Gregory Barbour, Davide Piras, Christopher Frye, Ilya Feige, David Barber
In recent years, the collection and sharing of individuals' private data has become commonplace in many industries. Local differential privacy (LDP) is a rigorous approach that uses a randomized algorithm to preserve privacy even from the database administrator, unlike the more standard central differential privacy. Under LDP, applying noise directly to high-dimensional data requires a level of noise that all but destroys data utility. In this paper we introduce a novel, application-agnostic privatization mechanism that leverages representation learning to overcome the prohibitive noise requirements of direct methods, while maintaining the strict guarantees of LDP. We further demonstrate that this privatization mechanism can be used to train machine learning algorithms across a range of applications, including private data collection, private novel-class classification, and the augmentation of clean datasets with additional privatized features. We achieve significant gains in performance on downstream classification tasks relative to benchmarks that noise the data directly, which are state-of-the-art in the context of application-agnostic LDP mechanisms for high-dimensional data.

The collection of personal data is ubiquitous, and unavoidable for many in everyday life. While this has undeniably improved the quality and user experience of many products and services, evidence of data misuse and data breaches (Sweeney, 1997; Jolly, 2020) has brought the concept of data privacy into sharp focus, fueling both regulatory changes and a shift in personal preferences. The onus has now fallen on organizations to determine if they are willing and able to collect personal data under these changing expectations.
Oct-23-2020
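To make the dimensionality problem concrete, the sketch below illustrates the standard Laplace mechanism for pure epsilon-LDP, not the paper's mechanism. With coordinates bounded in [0, 1], the L1 sensitivity of releasing a d-dimensional record directly is d, so the per-coordinate noise scale grows linearly with d; noising a low-dimensional learned representation instead shrinks the required scale. The `encode` function here is a hypothetical placeholder for a pre-trained encoder, included only to show where representation learning would enter.

```python
import numpy as np

def laplace_ldp(x, eps, lo=0.0, hi=1.0):
    """epsilon-LDP release of a vector with coordinates clipped to [lo, hi].

    The L1 sensitivity of the identity map on [lo, hi]^d is d * (hi - lo),
    so per-coordinate Laplace noise with scale sensitivity / eps suffices.
    """
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    d = x.size
    sensitivity = d * (hi - lo)
    return x + np.random.laplace(scale=sensitivity / eps, size=d)

# Direct noising of high-dimensional data: the noise scale grows with d.
x_raw = np.random.rand(784)              # e.g. a flattened 28x28 image
noisy_raw = laplace_ldp(x_raw, eps=1.0)  # scale = 784; the signal is swamped

# Noising a low-dimensional representation instead. `encode` is a
# hypothetical stand-in for a learned encoder mapping into [0, 1]^k.
k = 8
encode = lambda x: 1.0 / (1.0 + np.exp(-x[:k]))  # placeholder, not the paper's model
z = encode(x_raw)
noisy_z = laplace_ldp(z, eps=1.0)        # scale = 8; far more utility retained
```

Under the same privacy budget eps, the noise scale drops from d to k, which is the basic reason a learned low-dimensional representation can preserve utility where direct noising cannot.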