Protecting Classifiers From Attacks. A Bayesian Approach

Gallego, Victor, Naveiro, Roi, Redondo, Alberto, Insua, David Rios, Ruggeri, Fabrizio

Apr-18-2020–arXiv.org Machine Learning

Over this decade, an increasing number of processes are being automated through classification algorithms, so it is essential that these algorithms are robust and reliable if we are to trust key operations based on their output. State-of-the-art classifiers perform extraordinarily well on standard data, but they have been shown to be vulnerable to adversarial examples, data instances specifically crafted to fool the algorithms (Comiter, 2019). As a fundamental hypothesis, algorithms rely on the use of independent and identically distributed (iid) data for both the training and test phases. However, security aspects in classification, which form part of the field of adversarial machine learning (AML), question this hypothesis due to the presence of adversaries ready to modify the data to obtain a benefit, thus making the training and test distributions differ. Since the pioneering work in adversarial classification (AC) by Dalvi et al. (2004), the paradigm used to model the confrontation between adversaries and classification systems has been game theory; see the recent reviews in Biggio and Roli (2018) and Zhou et al. (2018). As an example, the most popular attacks, including the fast gradient sign method (FGSM) (Goodfellow et al., 2014b), may be viewed from a game-theoretic perspective. Similarly, two of the most promising defence techniques, adversarial training (AT) (Madry et al., 2018), which trains the defender model with attacked samples, and adversarial logit pairing (ALP) (Kannan et al., 2018), which encourages the logits of the model to be the same for both standard and adversarial inputs, may be framed in game-theoretic terms. This perspective typically entails common knowledge hypotheses (Hargreaves-Heap and Varoufakis, 2004) which, from a fundamental point of view, are not sustainable in settings such as security, as adversaries try to hide and conceal information. Recent work (Naveiro et al., 2019) presented ACRA, a novel approach for AC based on Adversarial Risk
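To make the notion of an adversarial example concrete, the following is a minimal sketch of an FGSM-style perturbation, assuming a two-feature logistic regression classifier as a stand-in for the attacked model. The weights, input, and step size `eps` are hypothetical values chosen for illustration and do not come from the paper; the update `x + eps * sign(gradient of the loss with respect to x)` is the standard FGSM step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move x by eps along the sign of the loss gradient.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input x is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input (illustrative values only).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
p_clean = predict_proba(w, b, x)
p_adv = predict_proba(w, b, x_adv)
print(p_clean > 0.5, p_adv > 0.5)
```

With these illustrative values, the clean input is assigned to class 1 while the perturbed input falls on the other side of the decision boundary, which is the kind of train/test distribution shift the abstract describes.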

Country:
- Europe (0.14)
- North America > United States
  - Ohio (0.04)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Uncertainty
      - Bayesian Inference (1.00)
    - Machine Learning
      - Statistical Learning (1.00)
      - Neural Networks > Deep Learning (0.93)
      - Learning Graphical Models > Directed Networks
        - Bayesian Learning (1.00)
