Adversarial representation learning for synthetic replacement of private attributes
John Martinsson, Edvin Listo Zec, Daniel Gillblad, Olof Mogren
Data privacy is an increasingly important aspect of many real-world big data analytics tasks. Data sources that contain sensitive information may have immense potential that could be unlocked using privacy-enhancing transformations, but current methods often fail to produce convincing output. Furthermore, finding the right balance between privacy and utility is often a tricky trade-off. In this work, we propose a novel approach to data privatization that involves two steps: in the first step, the sensitive information is removed; in the second, it is replaced with an independent random sample. Our method builds on adversarial representation learning, which ensures strong privacy by training the model to fool an increasingly strong adversary. While previous methods only aim at obfuscating the sensitive information, we find that adding new random information in its place both strengthens the provided privacy and yields better utility at any given level of privacy. The result is an approach that provides stronger privatization of image data while preserving both the domain and the utility of the inputs, entirely independently of the downstream task.

The increasing capacity and performance of modern machine learning models require increasing amounts of data for training them (Goodfellow et al., 2016). However, collecting and using large datasets that may contain sensitive information about individuals is often impeded by increasingly strong privacy laws protecting individual rights, and by the infeasibility of obtaining individual consent.
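The two-step scheme described above (filter out the sensitive attribute, then synthesize an independently sampled replacement, with both networks trained to fool an adversary) can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: the network shapes, the binary sensitive attribute, the reconstruction weight, and the specific losses are all assumptions made for the sake of a self-contained example.

```python
import torch
import torch.nn as nn

class Filter(nn.Module):
    """Step 1: suppress the sensitive attribute in the input image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Step 2: re-synthesize the image with an independently sampled
    replacement value of the sensitive attribute."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x_filtered, attr):
        # Broadcast the sampled attribute over the spatial dimensions and
        # append it as an extra input channel.
        a = attr.view(-1, 1, 1, 1).expand(-1, 1, *x_filtered.shape[2:])
        return self.net(torch.cat([x_filtered, a], dim=1))

class Adversary(nn.Module):
    """Tries to recover the original sensitive attribute from the output."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(1)

filt, gen, adv = Filter(), Generator(), Adversary()
opt_fg = torch.optim.Adam(list(filt.parameters()) + list(gen.parameters()), lr=2e-4)
opt_adv = torch.optim.Adam(adv.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

x = torch.rand(8, 3, 64, 64) * 2 - 1        # toy image batch in [-1, 1]
s_true = torch.randint(0, 2, (8,)).float()  # original sensitive attribute
s_new = torch.randint(0, 2, (8,)).float()   # independent random replacement

# Adversary update: learn to recover the true attribute from the output.
x_priv = gen(filt(x), s_new).detach()
loss_adv = bce(adv(x_priv), s_true)
opt_adv.zero_grad(); loss_adv.backward(); opt_adv.step()

# Filter/generator update: make the adversary believe the sampled attribute
# while keeping the output close to the input (utility term).
x_priv = gen(filt(x), s_new)
loss_fg = bce(adv(x_priv), s_new) + 10.0 * l1(x_priv, x)
opt_fg.zero_grad(); loss_fg.backward(); opt_fg.step()
```

The L1 reconstruction term is only a stand-in for keeping the privatized output in the input domain; the adversarial term is what drives the removal and replacement of the sensitive attribute, independently of any downstream task.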
Oct-5-2020
- Country:
- North America > United States > California (0.28)
- Genre:
- Overview > Innovation (0.34)
- Research Report > Promising Solution (0.34)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: