Learning to Generate Image Embeddings with User-level Differential Privacy

Xu, Zheng, Collins, Maxwell, Wang, Yuxiao, Panait, Liviu, Oh, Sewoong, Augenstein, Sean, Liu, Ting, Schroff, Florian, McMahan, H. Brendan

Mar-31-2023–arXiv.org Artificial Intelligence

Representation learning, by training deep neural networks as feature extractors to generate compact embedding vectors from images, is a fundamental component in computer vision. Metric learning, a kind of representation learning using supervised data, has been widely applied to image recognition, clustering, and retrieval [Schroff et al., 2015; Weinberger and Saul, 2009; Weyand et al., 2020]. Machine learning models have the capacity to memorize training data [Carlini et al., 2019, 2021], leading to privacy risks when the models are deployed. Privacy risk can also be audited by membership inference attacks [Carlini et al., 2022; Shokri et al., 2017], i.e. detecting whether certain data was used to train a model and potentially exposing users' usage behaviors. Defending against such risks is a critical responsibility when training on privacy-sensitive data. Differential Privacy (DP) [Dwork et al., 2006] is an extensively used quantifiable measurement of privacy risk, now generally accepted as a standard notion of privacy in both industry and government [Apple Privacy Team, 2017; Ding et al., 2017; McMahan and Thakurta, 2022; US Census Bureau, 2021]. Applied to machine learning, DP requires a training procedure with explicit randomness, and guarantees that the distribution over output models is quantifiably similar given a certain scope of change to the training dataset. A DP guarantee with respect to the change of a single arbitrary training example is known as example-level DP, which provides plausible deniability (in the binary hypothesis testing sense of [Kairouz et al., 2015]) that any single example (e.g., image) occurred

The first two authors contributed equally.
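
The "quantifiably similar" guarantee described above is usually written as (ε, δ)-differential privacy. The statement below is a sketch in standard notation, not taken from the excerpt: the symbols M (the randomized training procedure), D and D' (neighboring training datasets), S (any set of output models), and the parameters ε and δ are all assumed here for illustration.

```latex
% Standard (epsilon, delta)-DP condition; all symbols are assumptions
% introduced for illustration and do not appear in the excerpt above.
\[
  \Pr\bigl[\mathcal{M}(D) \in S\bigr]
  \;\le\;
  e^{\varepsilon}\,\Pr\bigl[\mathcal{M}(D') \in S\bigr] + \delta
  \qquad \text{for all neighboring } D, D' \text{ and all measurable } S .
\]
```

Under example-level DP, D and D' are neighbors when they differ in a single training example. The "certain scope of change" mentioned in the text corresponds to the choice of this neighboring relation; user-level DP, as named in the title, is generally understood to take neighbors that differ in all of the examples contributed by one user. Smaller ε and δ mean the two output distributions are harder to tell apart.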
