Rewriting Meaningful Sentences via Conditional BERT Sampling and an application on fooling text classifiers

Xu, Lei, Ramirez, Ivan, Veeramachaneni, Kalyan

Oct-22-2020–arXiv.org Artificial Intelligence

Most adversarial attack methods designed to deceive a text classifier change its prediction by modifying a few words or characters. Few attempt to attack classifiers by rewriting a whole sentence, both because sentence-level rephrasing is inherently difficult and because it is hard to set criteria for what counts as a legitimate rewrite. In this paper, we explore the problem of creating adversarial examples with sentence-level rewriting. We design a new sampling method, named ParaphraseSampler, to efficiently rewrite the original sentence in multiple ways. We then propose a new criterion for modification, called a sentence-level threat model. This criterion allows for both word- and sentence-level changes, and can be adjusted independently in two dimensions: semantic similarity and grammatical quality. Experimental results show that many of these rewritten sentences are misclassified by the classifier. On all six datasets, our ParaphraseSampler achieves a better attack success rate than our baseline.
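As a rough illustration of a modification criterion that is adjustable in two independent dimensions, the sketch below accepts a rewrite only if it clears a semantic-similarity threshold and a grammatical-quality threshold. It is not the authors' implementation: the names `ThreatModel` and `successful_attacks` are hypothetical, and measuring grammatical quality as a perplexity ratio is an assumption made here for concreteness. The scores and predicted labels are assumed to be computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatModel:
    """Hypothetical two-threshold acceptance rule for a rewritten sentence."""

    min_similarity: float        # lower bound on semantic similarity to the original
    max_perplexity_ratio: float  # assumed proxy for grammatical quality (rewrite / original)

    def accepts(self, similarity: float, perplexity_ratio: float) -> bool:
        # Both constraints must hold; each threshold can be tuned independently.
        return (
            similarity >= self.min_similarity
            and perplexity_ratio <= self.max_perplexity_ratio
        )


def successful_attacks(candidates, original_label, threat_model):
    """Return rewrites that satisfy the threat model and change the predicted label.

    Each candidate is a (text, similarity, perplexity_ratio, predicted_label) tuple.
    """
    return [
        text
        for text, similarity, perplexity_ratio, predicted in candidates
        if predicted != original_label
        and threat_model.accepts(similarity, perplexity_ratio)
    ]
```

Tightening `min_similarity` restricts how far the meaning may drift, while tightening `max_perplexity_ratio` restricts how much fluency may degrade; under this sketch, either can be changed without touching the other.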

artificial intelligence, machine learning, natural language

Country:
- Europe > Czechia (0.05)
- North America > United States
  - New York (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
- Asia
  - Middle East > Republic of Türkiye (0.05)
  - Japan > Honshū
    - Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre:
- Research Report (0.70)

Industry:
- Government (0.66)
- Leisure & Entertainment > Sports
  - Football (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (0.67)
  - Machine Learning > Neural Networks
    - Deep Learning (0.94)
