Universal Adversarial Perturbation for Text Classification

Oct-10-2019–arXiv.org Machine Learning

Given a state-of-the-art deep neural network text classifier, we show the existence of a universal and very small perturbation vector (in the embedding space) that causes natural text to be misclassified with high probability. Unlike images, for which a single fixed-size adversarial perturbation can be found, text is of variable length, so we define "universality" as "token-agnostic": a single perturbation is applied to each token, resulting in perturbations of different sizes at the sequence level. We propose an algorithm to compute universal adversarial perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to them, even though the perturbations mostly preserve the neighborhood of each token. We also show how to use these adversarial perturbations to generate adversarial text samples. The surprising existence of universal "token-agnostic" adversarial perturbations may reveal important properties of a text classifier.
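
As a rough illustration of the "token-agnostic" definition above, the following minimal sketch adds one perturbation vector to every token embedding of a sequence. It is not the paper's algorithm for computing the perturbation; the function names, the toy 2-dimensional embeddings, and the L2-norm bound `epsilon` used to keep the vector small are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def l2_norm(v: Vector) -> float:
    """Euclidean norm of a vector."""
    return sum(x * x for x in v) ** 0.5


def project_to_ball(v: Vector, epsilon: float) -> Vector:
    """Rescale v so its L2 norm is at most epsilon (assumed smallness constraint)."""
    norm = l2_norm(v)
    if norm == 0.0 or norm <= epsilon:
        return list(v)
    scale = epsilon / norm
    return [x * scale for x in v]


def apply_universal_perturbation(embeddings: List[Vector], delta: Vector) -> List[Vector]:
    """Add the same vector delta to every token embedding in the sequence."""
    return [[e + d for e, d in zip(token, delta)] for token in embeddings]


# Hypothetical toy data: 2-dimensional embeddings for a 2-token and a 3-token text.
delta = project_to_ball([3.0, 4.0], epsilon=1.0)
short_text = [[0.0, 0.0], [1.0, 1.0]]
long_text = [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]

perturbed_short = apply_universal_perturbation(short_text, delta)
perturbed_long = apply_universal_perturbation(long_text, delta)
```

Under these assumptions every token is shifted by the same `delta`, while the total change to a sequence grows with its length, which is one way to read "perturbations of different sizes at the sequence level".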

deep learning, neural network, perturbation, (18 more...)

Country:
- North America > United States > Maryland > Baltimore (0.15)

Genre:
- Research Report (0.50)

Industry:
- Information Technology > Security & Privacy (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.89)
  - Natural Language (1.00)
