to

### black-box attack

We introduce OPtical ADversarial attack (OPAD). OPAD is an adversarial attack in the physical space aiming to fool image classifiers without physically touching the objects (e.g., moving or painting the objects). The principle of OPAD is to use structured illumination to alter the appearance of the target objects. The system consists of a low-cost projector, a camera, and a computer. The challenge of the problem is the non-linearity of the radiometric response of the projector and the spatially varying spectral response of the scene. Attacks generated in a conventional approach do not work in this setting unless they are calibrated to compensate for such a projector-camera model. The proposed solution incorporates the projector-camera model into the adversarial attack optimization, where a new attack formulation is derived. Experimental results prove the validity of the solution. It is demonstrated that OPAD can optically attack a real 3D object in the presence of background lighting for white-box, black-box, targeted, and untargeted attacks. Theoretical analysis is presented to quantify the fundamental performance limit of the system.

### Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks

Transfer-based adversarial attacks can effectively evaluate model robustness in the black-box setting. Though several methods have demonstrated impressive transferability of untargeted adversarial examples, targeted adversarial transferability is still challenging. The existing methods either have low targeted transferability or sacrifice computational efficiency. In this paper, we develop a simple yet practical framework to efficiently craft targeted transfer-based adversarial examples. Specifically, we propose a conditional generative attacking model, which can generate the adversarial examples targeted at different classes by simply altering the class embedding and share a single backbone. Extensive experiments demonstrate that our method improves the success rates of targeted black-box attacks by a significant margin over the existing methods -- it reaches an average success rate of 29.6\% against six diverse models based only on one substitute white-box model in the standard testing of NeurIPS 2017 competition, which outperforms the state-of-the-art gradient-based attack methods (with an average success rate of $<$2\%) by a large margin. Moreover, the proposed method is also more efficient beyond an order of magnitude than gradient-based methods.

### Imperceptible Adversarial Examples for Fake Image Detection

Fooling people with highly realistic fake images generated with Deepfake or GANs brings a great social disturbance to our society. Many methods have been proposed to detect fake images, but they are vulnerable to adversarial perturbations -- intentionally designed noises that can lead to the wrong prediction. Existing methods of attacking fake image detectors usually generate adversarial perturbations to perturb almost the entire image. This is redundant and increases the perceptibility of perturbations. In this paper, we propose a novel method to disrupt the fake image detection by determining key pixels to a fake image detector and attacking only the key pixels, which results in the $L_0$ and the $L_2$ norms of adversarial perturbations much less than those of existing works. Experiments on two public datasets with three fake image detectors indicate that our proposed method achieves state-of-the-art performance in both white-box and black-box attacks.

Automatic modulation classification can be a core component for intelligent spectrally efficient wireless communication networks, and deep learning techniques have recently been shown to deliver superior performance to conventional model-based strategies, particularly when distinguishing between a large number of modulation types. However, such deep learning techniques have also been recently shown to be vulnerable to gradient-based adversarial attacks that rely on subtle input perturbations, which would be particularly feasible in a wireless setting via jamming. One such potent attack is the one known as the Carlini-Wagner attack, which we consider in this work. We further consider a data-driven subsampling setting, where several recently introduced deep-learning-based algorithms are employed to select a subset of samples that lead to reducing the final classifier's training time with minimal loss in accuracy. In this setting, the attacker has to make an assumption about the employed subsampling strategy, in order to calculate the loss gradient. Based on state of the art techniques available to both the attacker and defender, we evaluate best strategies under various assumptions on the knowledge of the other party's strategy. Interestingly, in presence of knowledgeable attackers, we identify computational cost reduction opportunities for the defender with no or minimal loss in performance.

### A Robust Adversarial Network-Based End-to-End Communications System With Strong Generalization Ability Against Adversarial Attacks

We propose a novel defensive mechanism based on a generative adversarial network (GAN) framework to defend against adversarial attacks in end-to-end communications systems. Specifically, we utilize a generative network to model a powerful adversary and enable the end-to-end communications system to combat the generative attack network via a minimax game. We show that the proposed system not only works well against white-box and black-box adversarial attacks but also possesses excellent generalization capabilities to maintain good performance under no attacks. We also show that our GAN-based end-to-end system outperforms the conventional communications system and the end-to-end communications system with/without adversarial training.

### Black-box Adversarial Attacks in Autonomous Vehicle Technology

Despite the high quality performance of the deep neural network in real-world applications, they are susceptible to minor perturbations of adversarial attacks. This is mostly undetectable to human vision. The impact of such attacks has become extremely detrimental in autonomous vehicles with real-time "safety" concerns. The black-box adversarial attacks cause drastic misclassification in critical scene elements such as road signs and traffic lights leading the autonomous vehicle to crash into other vehicles or pedestrians. In this paper, we propose a novel query-based attack method called Modified Simple black-box attack (M-SimBA) to overcome the use of a white-box source in transfer based attack method. Also, the issue of late convergence in a Simple black-box attack (SimBA) is addressed by minimizing the loss of the most confused class which is the incorrect class predicted by the model with the highest probability, instead of trying to maximize the loss of the correct class. We evaluate the performance of the proposed approach to the German Traffic Sign Recognition Benchmark (GTSRB) dataset. We show that the proposed model outperforms the existing models like Transfer-based projected gradient descent (T-PGD), SimBA in terms of convergence time, flattening the distribution of confused class probability, and producing adversarial samples with least confidence on the true class.

### SurFree: a fast surrogate-free black-box attack

Machine learning classifiers are critically prone to evasion attacks. Adversarial examples are slightly modified inputs that are then misclassified, while remaining perceptively close to their originals. Last couple of years have witnessed a striking decrease in the amount of queries a black box attack submits to the target classifier, in order to forge adversarials. This particularly concerns the black-box score-based setup, where the attacker has access to top predicted probabilites: the amount of queries went from to millions of to less than a thousand. This paper presents SurFree, a geometrical approach that achieves a similar drastic reduction in the amount of queries in the hardest setup: black box decision-based attacks (only the top-1 label is available). We first highlight that the most recent attacks in that setup, HSJA, QEBA and GeoDA all perform costly gradient surrogate estimations. SurFree proposes to bypass these, by instead focusing on careful trials along diverse directions, guided by precise indications of geometrical properties of the classifier decision boundaries. We motivate this geometric approach before performing a head-to-head comparison with previous attacks with the amount of queries as a first class citizen. We exhibit a faster distortion decay under low query amounts (few hundreds to a thousand), while remaining competitive at higher query budgets.

### Learning Black-Box Attackers with Transferable Priors and Query Feedback

This paper addresses the challenging black-box adversarial attack problem, where only classification confidence of a victim model is available. Inspired by consistency of visual saliency between different vision models, a surrogate model is expected to improve the attack performance via transferability. By combining transferability-based and query-based black-box attack, we propose a surprisingly simple baseline approach (named SimBA++) using the surrogate model, which significantly outperforms several state-of-the-art methods. Moreover, to efficiently utilize the query feedback, we update the surrogate model in a novel learning scheme, named High-Order Gradient Approximation (HOGA). By constructing a high-order gradient computation graph, we update the surrogate model to approximate the victim model in both forward and backward pass. The SimBA++ and HOGA result in Learnable Black-Box Attack (LeBA), which surpasses previous state of the art by considerable margins: the proposed LeBA significantly reduces queries, while keeping higher attack success rates close to 100% in extensive ImageNet experiments, including attacking vision benchmarks and defensive models. Code is open source at https://github.com/TrustworthyDL/LeBA.

### Exploiting Vulnerabilities of Deep Learning-based Energy Theft Detection in AMI through Adversarial Attacks

Effective detection of energy theft can prevent revenue losses of utility companies and is also important for smart grid security. In recent years, enabled by the massive fine-grained smart meter data, deep learning (DL) approaches are becoming popular in the literature to detect energy theft in the advanced metering infrastructure (AMI). However, as neural networks are shown to be vulnerable to adversarial examples, the security of the DL models is of concern. In this work, we study the vulnerabilities of DL-based energy theft detection through adversarial attacks, including single-step attacks and iterative attacks. From the attacker's point of view, we design the \textit{SearchFromFree} framework that consists of 1) a randomly adversarial measurement initialization approach to maximize the stolen profit and 2) a step-size searching scheme to increase the performance of black-box iterative attacks. The evaluation based on three types of neural networks shows that the adversarial attacker can report extremely low consumption measurements to the utility without being detected by the DL models. We finally discuss the potential defense mechanisms against adversarial attacks in energy theft detection.