fooling
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > Canada (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
Review for NeurIPS paper: Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks
Additional Feedback: The proposed methods perform very strongly in ECE, slightly better than the state of the art in NLL, and slightly worse in classwise-ECE. It would be good to have some explanation of why ECE and classwise-ECE give such different results. Since ECE studies the calibration of only the class with the highest predicted probability and ignores the other class probabilities, does this mean that the proposed method is better than the state of the art on the top-1 probability but slightly weaker on the other classes? In the appendix provided as supplemental material, at lines 739-742 it is claimed that ECE does not suffer from the same problem that is highlighted about classwise-ECE at lines 731-738. While this is technically correct, it misses the point: ECE in fact suffers from essentially the same problem.
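The distinction the reviewer draws can be made concrete: top-1 ECE bins predictions by their maximum predicted probability and averages the gap between each bin's accuracy and its mean confidence. A minimal sketch (the binning scheme is the common equal-width one; the toy numbers are illustrative, not from the paper):

```python
def expected_calibration_error(confidences, correct, n_bins=15):
    """Top-1 ECE: bin predictions by confidence and average the
    |accuracy - confidence| gap over bins, weighted by bin size."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences) if lo < c <= hi]
        if idx:
            acc = sum(correct[i] for i in idx) / len(idx)
            avg_conf = sum(confidences[i] for i in idx) / len(idx)
            ece += len(idx) / n * abs(acc - avg_conf)
    return ece

# Toy example: an overconfident classifier's top-1 confidences
conf = [0.9, 0.95, 0.85, 0.99]
correct = [1, 0, 1, 1]
ece = expected_calibration_error(conf, correct)
```

Because only the top-1 probability enters the computation, a model can score well on ECE while the remaining class probabilities are poorly calibrated, which is exactly the asymmetry the reviewer asks about.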
Attack to Fool and Explain Deep Networks
Akhtar, Naveed, Jalwana, Muhammad A. A. K., Bennamoun, Mohammed, Mian, Ajmal
Deep visual models are susceptible to adversarial perturbations of their inputs. Although these signals are carefully crafted, they still appear as noise-like patterns to humans. This observation has led to the argument that deep visual representations are misaligned with human perception. We counter-argue by providing evidence of human-meaningful patterns in adversarial perturbations. We first propose an attack that fools a network into confusing a whole category of objects (the source class) with a target label. Our attack also limits unintended fooling by samples from non-source classes, thereby circumscribing human-defined semantic notions for network fooling. We show that the proposed attack not only leads to the emergence of regular geometric patterns in the perturbations, but also reveals insightful information about the decision boundaries of deep models. Exploring this phenomenon further, we alter the `adversarial' objective of our attack to use it as a tool to `explain' deep visual representations. We show that by careful channeling and projection of the perturbations computed by our method, we can visualize a model's understanding of human-defined semantic notions. Finally, we exploit the explainability properties of our perturbations to perform image generation, inpainting, and interactive image manipulation by attacking adversarially robust `classifiers'. In all, our major contribution is a novel pragmatic adversarial attack that is subsequently transformed into a tool to interpret visual models. The article also makes secondary contributions by establishing the utility of our attack beyond the adversarial objective through multiple interesting applications.
- Education > Educational Setting > Higher Education (0.46)
- Information Technology > Security & Privacy (0.36)
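The two-part objective the abstract describes — pushing source-class samples toward the target label while limiting unintended fooling of non-source samples — can be sketched as a combined loss. This is an illustrative reconstruction, not the paper's exact formulation; the function name, the equal weighting of the two terms, and the cross-entropy form of each term are assumptions:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fooling_loss(logits_src, logits_non, target, labels_non):
    """Hypothetical attack objective in the spirit of the paper: drive
    source-class samples toward the target label while keeping
    non-source samples on their original labels."""
    # term 1: maximize target probability on source-class samples
    src_term = -sum(math.log(softmax(z)[target]) for z in logits_src) / len(logits_src)
    # term 2: preserve original labels on non-source samples
    # (this is what limits unintended fooling)
    non_term = -sum(math.log(softmax(z)[y]) for z, y in zip(logits_non, labels_non)) / len(logits_non)
    return src_term + non_term
```

A perturbation minimizing such a loss is rewarded only for flipping the source class, which is how the attack circumscribes a human-defined semantic notion rather than fooling indiscriminately.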
Fooling Neural Network Interpretations via Adversarial Model Manipulation
Heo, Juyeon, Joo, Sunghwan, Moon, Taesup
We ask whether neural network interpretation methods can be fooled via adversarial model manipulation, which we define as a model fine-tuning step that aims to radically alter the explanations without hurting the accuracy of the original model. By incorporating the interpretation results directly into the regularization term of the fine-tuning objective, we show that state-of-the-art interpreters, e.g., LRP and Grad-CAM, can be easily fooled by our model manipulation. We propose two types of fooling, passive and active, and demonstrate that such fooling generalizes well to the entire validation set and transfers to other interpretation methods. Our results are validated both by visually showing the fooled explanations and by reporting quantitative metrics that measure the deviations from the original explanations. We argue that the stability of a neural network interpretation method with respect to our adversarial model manipulation is an important criterion to check when developing robust and reliable interpretation methods.
- North America > United States (0.14)
- Europe (0.14)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
- Research Report > New Finding (0.48)
- Research Report > Promising Solution (0.34)
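The fine-tuning objective the abstract describes — the task loss plus a regularizer computed from the interpretation itself — can be sketched as follows. All names here are hypothetical, and the penalty is only one plausible instance of the passive-fooling idea (pushing attribution mass off the originally salient region), not the paper's exact formulation:

```python
def manipulation_objective(task_loss, heatmap, original_mask, lam=1.0):
    """Hypothetical fine-tuning objective: keep the task loss low while
    penalizing attribution mass that remains on the originally salient
    region. `heatmap` is a flat list of attribution values and
    `original_mask` a 0/1 list marking that region (illustrative names)."""
    total = sum(abs(h) for h in heatmap) or 1.0
    # fraction of (normalized) attribution still on the old region
    mass_on_region = sum(abs(h) * m for h, m in zip(heatmap, original_mask)) / total
    return task_loss + lam * mass_on_region
```

Minimizing such an objective during fine-tuning changes where the explanation points without requiring the predictions themselves to change, which is why accuracy can be preserved while the interpretation is radically altered.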
Denoising Autoencoders for Overgeneralization in Neural Networks
Despite the recent developments that allowed neural networks to achieve impressive performance on a variety of applications, these models are intrinsically affected by the problem of overgeneralization, due to their partitioning of the full input space into the fixed set of target classes used during training. Thus it is possible for novel inputs belonging to categories unknown during training, or even completely unrecognizable to humans, to fool the system into classifying them as one of the known classes, even with a high degree of confidence. Solving this problem may help improve the security of such systems in critical applications, and may further lead to applications in the context of open set recognition and one-class recognition. This paper presents a novel way to compute a confidence score using denoising autoencoders and shows that such a confidence score can correctly identify the regions of the input space close to the training distribution by approximately identifying its local maxima.
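A reconstruction-error-based confidence score of the kind the abstract describes might look like the following sketch. The Gaussian mapping from error to score, the function names, and the toy denoiser are all assumptions for illustration, not the paper's exact construction:

```python
import math

def dae_confidence(x, denoise, scale=1.0):
    """Hypothetical confidence score: a denoising autoencoder reconstructs
    inputs near the training distribution almost perfectly, so a small
    reconstruction error maps to a score near 1, while inputs far from
    the data reconstruct poorly and score near 0. `denoise` stands in
    for a trained DAE's reconstruction function."""
    r = denoise(x)
    err2 = sum((xi - ri) ** 2 for xi, ri in zip(x, r))
    return math.exp(-err2 / scale)

# Toy denoiser that pulls inputs toward the origin, standing in for a
# DAE trained on data centered at zero.
denoise = lambda x: [0.9 * xi for xi in x]
near = dae_confidence([0.1, 0.0], denoise)  # close to the "training data"
far = dae_confidence([5.0, 5.0], denoise)   # far from it
```

The intuition matches the abstract: the DAE's reconstruction pulls points toward high-density regions of the training distribution, so small displacement signals proximity to a local maximum of that distribution.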
Fooling all the people all the time: the rise of artificial intelligence and fake news
Modern artificial intelligence is way beyond playing chess; it has mastered Go and kicks butt in Dota 2, among other games. What started as a test-lab monkey has evolved into something akin to a prodigy child. Artificial intelligence, or AI, may still have to be fed information, but once it has gathered enough, it can come up with results that mimic the original data. First came the static images -- AI managed to create perfectly convincing images of people who have never existed. Then it showed it was perfectly capable of mimicking different seasons.
- Media > News (1.00)
- Education > Health & Safety > School Safety & Security > School Violence (0.33)
Deep Learning can be easily fooled
In a post I wrote last year, I talked about the fact that a Deep Neural Network could not label a slightly changed image correctly. Recently, a related result was shown by researchers from the University of Wyoming and Cornell University. They produced images completely unrecognizable to human eyes (as shown in the right picture) while DNNs still label them as familiar objects (such as cheetah/peacock/baseball/…) with 99.99% confidence. The researchers used one of the best Deep Neural Networks, "AlexNet", trained on the 1.3-million-image ILSVRC 2012 ImageNet dataset to achieve state-of-the-art performance, and the "LeNet" model trained on the MNIST dataset to test whether the result holds for other DNN architectures. "AlexNet" and "LeNet" are both provided by the Caffe software package.