
Collaborating Authors

 Jang, Uyeong


Generating Semantic Adversarial Examples with Differentiable Rendering

arXiv.org Machine Learning

Machine learning (ML) algorithms, especially deep neural networks, have demonstrated success in several domains. However, several types of attacks have raised concerns about deploying ML in safety-critical domains, such as autonomous driving and security. An attacker perturbs a data point slightly in the concrete feature space (e.g., pixel space) and causes the ML algorithm to produce incorrect output (e.g., a perturbed stop sign is classified as a yield sign). These perturbed data points are called adversarial examples, and the literature contains numerous algorithms for constructing adversarial examples and for defending against them. In this paper we explore semantic adversarial examples (SAEs), where an attacker creates perturbations in the semantic space representing the environment that produces the input for the ML model. For example, an attacker can make the background of an image cloudier to cause misclassification. We present an algorithm for constructing SAEs that uses recent advances in differentiable rendering and inverse graphics.
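
To make the idea concrete, here is a minimal sketch (not the paper's implementation) of attacking in a semantic space through a differentiable rendering step: a toy "renderer" maps two interpretable parameters, brightness and cloudiness, to an image, and gradient ascent on a fixed toy classifier's loss is performed over those parameters. The renderer, classifier, and parameter names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins (assumptions, not the paper's pipeline): a differentiable
# "renderer" mapping semantic parameters to a flat image, plus a fixed
# linear classifier over that image.
def render(base_image, brightness, cloudiness):
    # Brighten the scene and blend in a grey "cloud" layer.
    cloud_layer = 0.7 * torch.ones_like(base_image)
    img = brightness * base_image + cloudiness * cloud_layer
    return img.clamp(0.0, 1.0)

num_pixels, num_classes = 64, 3
base_image = torch.rand(num_pixels)              # the underlying scene
W = torch.randn(num_classes, num_pixels) * 0.1   # toy classifier weights
true_label = torch.tensor(0)

# Semantic parameters live in a low-dimensional, interpretable space.
brightness = torch.tensor(1.0, requires_grad=True)
cloudiness = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.Adam([brightness, cloudiness], lr=0.05)

for step in range(200):
    img = render(base_image, brightness, cloudiness)
    logits = W @ img
    # Negate the loss so the optimizer *maximizes* the classification loss.
    loss = -F.cross_entropy(logits.unsqueeze(0), true_label.unsqueeze(0))
    opt.zero_grad()
    loss.backward()          # gradients flow through the renderer to the semantic params
    opt.step()
    with torch.no_grad():    # keep the semantic edit physically plausible
        brightness.clamp_(0.5, 1.5)
        cloudiness.clamp_(0.0, 1.0)

pred = (W @ render(base_image, brightness, cloudiness)).argmax().item()
print(f"brightness={brightness.item():.2f}, cloudiness={cloudiness.item():.2f}, prediction={pred}")
```

Because the perturbation is parameterized by scene-level quantities rather than per-pixel noise, the resulting example stays semantically plausible by construction; the differentiable renderer is what makes gradient-based search over that space possible.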


On Need for Topology Awareness of Generative Models

arXiv.org Machine Learning

The manifold assumption in learning states that the data lie approximately on a manifold of much lower dimension than the input space. Generative models learn to generate data according to the underlying data distribution and are used in various tasks, such as data augmentation and generating variations of images. This paper addresses the following question: do generative models need to be aware of the topology of the underlying data manifold on which the data lie? We argue that the answer is yes and demonstrate that this has ramifications for security-critical applications, such as generative-model-based defenses against adversarial examples. We provide theoretical and experimental results to support our claims.
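
The topological point can be illustrated numerically. Below is a minimal sketch (not from the paper) of a data manifold with two connected components and a hypothetical generator with a connected 1-D latent space: since a continuous map sends a connected set to a connected set, the generator's range must bridge the gap between the components, placing samples far from the true manifold. A defense that projects inputs onto such a generator's range can therefore accept off-manifold points. The generator, cluster geometry, and threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data manifold with two connected components: two well-separated 2-D clusters.
centers = np.array([[-5.0, 0.0], [5.0, 0.0]])
data = np.vstack([c + 0.1 * rng.standard_normal((200, 2)) for c in centers])

# Hypothetical generator with a connected latent space (a 1-D Gaussian) and a
# continuous map: it interpolates between the two cluster centers.
def generator(z):
    w = 1.0 / (1.0 + np.exp(-z))            # squash latent into [0, 1]
    return (1.0 - w)[:, None] * centers[0] + w[:, None] * centers[1]

z = rng.standard_normal(2000)
samples = generator(z)

# Distance from each generated sample to the (two-component) data manifold.
dist = np.minimum(np.linalg.norm(samples - centers[0], axis=1),
                  np.linalg.norm(samples - centers[1], axis=1))
print(f"fraction of generated samples > 1.0 away from the data manifold: {(dist > 1.0).mean():.2f}")
```
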


Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training

arXiv.org Machine Learning

In this paper we study how to leverage confidence information induced by adversarial training to reinforce the adversarial robustness of a given adversarially trained model. A natural measure of confidence is $\|F({\bf x})\|_\infty$ (i.e., how confident $F$ is about its prediction). We start by analyzing the adversarial training formulation proposed by Madry et al. We demonstrate that, under a variety of instantiations, even a solution that is only somewhat good with respect to their objective induces confidence to act as a discriminator, which can distinguish between right and wrong model predictions in a neighborhood of a point sampled from the underlying distribution. Based on this, we propose Highly Confident Near Neighbor (${\tt HCNN}$), a framework that combines confidence information and nearest neighbor search to reinforce the adversarial robustness of a base model. We give algorithms in this framework and perform a detailed empirical study. We report encouraging experimental results that support our analysis, and we also discuss problems we observed with existing adversarial training.
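
As a rough illustration of combining the confidence measure $\|F({\bf x})\|_\infty$ with neighbor search, here is a minimal sketch: candidates near the input are scored by the confidence of the model's output, and the prediction of the most confident candidate is returned. The random sampling scheme, the toy linear model, and all parameter values are assumptions for illustration, not the paper's ${\tt HCNN}$ algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy stand-in for an adversarially trained model F: a fixed linear classifier
# returning softmax probabilities (the real F would be a trained network).
W = rng.standard_normal((3, 10)) * 0.5
def F(x):
    return softmax(x @ W.T)

def confident_neighbor_predict(x, radius=0.3, num_candidates=100):
    """Sketch of a highly-confident-near-neighbor style prediction: sample
    candidates near x, score each by ||F(z)||_inf, and return the prediction
    of the most confident candidate."""
    candidates = x + radius * rng.uniform(-1.0, 1.0, size=(num_candidates, x.shape[0]))
    probs = F(np.vstack([x[None, :], candidates]))   # include x itself
    confidence = probs.max(axis=1)                   # ||F(z)||_inf for each candidate
    best = confidence.argmax()
    return probs[best].argmax(), confidence[best]

x = rng.standard_normal(10)
label, conf = confident_neighbor_predict(x)
print(f"prediction: class {label} with confidence {conf:.3f}")
```

The intuition carried over from the abstract is that, for an adversarially trained model, low confidence tends to flag wrong predictions near the data distribution, so deferring to a nearby high-confidence point can correct errors introduced by small perturbations.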