Unlearnable Examples: Making Personal Data Unexploitable
Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, James Bailey, Yisen Wang
The volume of "free" data on the Internet has been key to the current success of deep learning. However, it also raises privacy concerns about the unauthorized exploitation of personal data for training commercial models. It is thus crucial to develop methods that prevent unauthorized data exploitation. This paper raises the question: can data be made unlearnable for deep learning models? We present a type of error-minimizing noise that can indeed make training examples unlearnable. Error-minimizing noise is intentionally generated to reduce the training error of one or more examples to near zero, which tricks the model into believing there is "nothing" to learn from these examples. The noise is restricted to be imperceptible to the human eye and thus does not affect normal data utility. We empirically verify the effectiveness of error-minimizing noise in both sample-wise and class-wise forms. We also demonstrate its flexibility across extensive experimental settings and its practicality in a case study on face recognition. Our work establishes an important first step towards making personal data unexploitable to deep learning models.

In recent years, deep learning has achieved groundbreaking successes in several fields, such as computer vision (He et al., 2016) and natural language processing (Devlin et al., 2018). This is partly attributable to the availability of large-scale datasets crawled freely from the Internet, such as ImageNet (Russakovsky et al., 2015) and ReCoRD (Zhang et al., 2018b). While these datasets provide a playground for developing deep learning models, a concerning fact is that some were collected without mutual consent (Prabhu & Birhane, 2020). Personal data has also been collected from the Internet without individuals' awareness and used to train commercial models (Hill, 2020). This has raised public concerns about the "free" exploitation of personal data for unauthorized or even illegal purposes.
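To make the idea concrete, the sketch below shows the inner noise-update step of the min-min objective in PyTorch: unlike adversarial PGD, which ascends the loss, the perturbation here descends it, so each perturbed example appears already learned. This is an illustrative fragment, not the authors' released code; the function name error_minimizing_noise and the hyperparameters (eps, alpha, steps) are assumptions chosen for the sketch.

    import torch
    import torch.nn.functional as F

    def error_minimizing_noise(model, x, y, eps=8/255, alpha=2/255, steps=20):
        """Inner min-step of the min-min objective (illustrative sketch):
        PGD that *descends* the loss so x + delta looks already learned."""
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta -= alpha * grad.sign()                    # descend, not ascend
                delta.clamp_(-eps, eps)                         # imperceptibility budget
                delta.copy_(torch.clamp(x + delta, 0, 1) - x)   # keep pixels valid
        return delta.detach()

In the paper's full procedure, a noise update of this kind alternates with a few steps of ordinary training on the perturbed data, stopping once the error on the perturbed examples is near zero; the class-wise variant shares a single delta across all examples of a class.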
Jan-13-2021