Goto

Collaborating Authors

 Sabokrou, Mohammad


Scanning Trojaned Models Using Out-of-Distribution Samples

arXiv.org Artificial Intelligence

Scanning for trojan (backdoor) in deep neural networks is crucial due to their significant real-world applications. There has been an increasing focus on developing effective general trojan scanning methods across various trojan attacks. Despite advancements, there remains a shortage of methods that perform effectively without preconceived assumptions about the backdoor attack method. Additionally, we have observed that current methods struggle to identify classifiers trojaned using adversarial training. Motivated by these challenges, our study introduces a novel scanning method named TRODO (TROjan scanning by Detection of adversarial shifts in Out-of-distribution samples). TRODO leverages the concept of "blind spots"--regions where trojaned classifiers erroneously identify out-of-distribution (OOD) samples as in-distribution (ID). We scan for these blind spots by adversarially shifting OOD samples towards in-distribution. The increased likelihood of perturbed OOD samples being classified as ID serves as a signature for trojan detection. TRODO is both trojan and label mapping agnostic, effective even against adversarially trained trojaned classifiers. It is applicable even in scenarios where training data is absent, demonstrating high accuracy and adaptability across various scenarios and datasets, highlighting its potential as a robust trojan scanning strategy.


Killing it with Zero-Shot: Adversarially Robust Novelty Detection

arXiv.org Artificial Intelligence

Novelty Detection (ND) plays a crucial role in machine learning by identifying new or unseen data during model inference. This capability is especially important for the safe and reliable operation of automated systems. Despite advances in this field, existing techniques often fail to maintain their performance when subject to adversarial attacks. Our research addresses this gap by marrying the merits of nearest-neighbor algorithms with robust features obtained from models pretrained on ImageNet. We focus on enhancing the robustness and performance of ND algorithms. Experimental results demonstrate that our approach significantly outperforms current state-of-the-art methods across various benchmarks, particularly under adversarial conditions. By incorporating robust pretrained features into the k-NN algorithm, we establish a new standard for performance and robustness in the field of robust ND. This work opens up new avenues for research aimed at fortifying machine learning systems against adversarial vulnerabilities. Our implementation is publicly available at https://github.com/rohban-lab/ZARND.


Continual Learning: Less Forgetting, More OOD Generalization via Adaptive Contrastive Replay

arXiv.org Artificial Intelligence

Split CIFAR-100 Split Mini-ImageNet Split Tiny-ImageNet 20 15 10 5 0 GEM A-GEM ER GSS GDUMB HAL MetaSP SOIF Ours Figure 1: Evaluating Out-of-Distribution (OOD) Generalization Capability: The performance of state-of-the-art rehearsal-based methods on the Split CIFAR-100, Split Mini-ImageNet, and Split Tiny-ImageNet datasets significantly drops on OOD samples, highlighting their lack of generalization. In this paper, we address this challenge by proposing a method that consistently outperforms existing approaches across all datasets. Machine learning models often suffer from catastrophic forgetting of previously learned knowledge when learning new classes. Various methods have been proposed to mitigate this issue. However, rehearsal-based learning, which retains samples from previous classes, typically achieves good performance but tends to memorize specific instances, struggling with Out-of-Distribution (OOD) generalization. This often leads to high forgetting rates and poor generalization. Surprisingly, the OOD generalization capabilities of these methods have been largely unexplored. In this paper, we highlight this issue and propose a simple yet effective strategy inspired by contrastive learning and data-centric principles to address it. We introduce Adaptive Contrastive Replay (ACR), a method that employs dual optimization to simultaneously train both the encoder and the classifier. ACR adaptively populates the replay buffer with misclassified samples while ensuring a balanced representation of classes and tasks. By refining the decision boundary in this way, ACR achieves a balance between stability and plasticity.


Class-Adaptive Sampling Policy for Efficient Continual Learning

arXiv.org Artificial Intelligence

Continual learning (CL) aims to acquire new knowledge while preserving information from previous experiences without forgetting. Though buffer-based methods (i.e., retaining samples from previous tasks) have achieved acceptable performance, determining how to allocate the buffer remains a critical challenge. Most recent research focuses on refining these methods but often fails to sufficiently consider the varying influence of samples on the learning process, and frequently overlooks the complexity of the classes/concepts being learned. Generally, these methods do not directly take into account the contribution of individual classes. However, our investigation indicates that more challenging classes necessitate preserving a larger number of samples compared to less challenging ones. To address this issue, we propose a novel method and policy named 'Class-Adaptive Sampling Policy' (CASP), which dynamically allocates storage space within the buffer. By utilizing concepts of class contribution and difficulty, CASP adaptively manages buffer space, allowing certain classes to occupy a larger portion of the buffer while reducing storage for others. This approach significantly improves the efficiency of knowledge retention and utilization. CASP provides a versatile solution to boost the performance and efficiency of CL. It meets the demand for dynamic buffer allocation, accommodating the varying contributions of different classes and their learning complexities over time.


A Unified Concept-Based System for Local, Global, and Misclassification Explanations

arXiv.org Artificial Intelligence

Explainability of Deep Neural Networks (DNNs) has been garnering increasing attention in recent years. Of the various explainability approaches, concept-based techniques stand out for their ability to utilize human-meaningful concepts instead of focusing solely on individual pixels. However, there is a scarcity of methods that consistently provide both local and global explanations. Moreover, most of the methods have no offer to explain misclassification cases. Considering these challenges, we present a unified concept-based system for unsupervised learning of both local and global concepts. Our primary objective is to uncover the intrinsic concepts underlying each data category by training surrogate explainer networks to estimate the importance of the concepts. Our experimental results substantiated the efficacy of the discovered concepts through diverse quantitative and qualitative assessments, encompassing faithfulness, completeness, and generality. Furthermore, our approach facilitates the explanation of both accurate and erroneous predictions, rendering it a valuable tool for comprehending the characteristics of the target objects and classes.


Mitigating Bias: Enhancing Image Classification by Improving Model Explanations

arXiv.org Artificial Intelligence

Deep learning models have demonstrated remarkable capabilities in learning complex patterns and concepts from training data. However, recent findings indicate that these models tend to rely heavily on simple and easily discernible features present in the background of images rather than the main concepts or objects they are intended to classify. This phenomenon poses a challenge to image classifiers as the crucial elements of interest in images may be overshadowed. In this paper, we propose a novel approach to address this issue and improve the learning of main concepts by image classifiers. Our central idea revolves around concurrently guiding the model's attention toward the foreground during the classification task. By emphasizing the foreground, which encapsulates the primary objects of interest, we aim to shift the focus of the model away from the dominant influence of the background. To accomplish this, we introduce a mechanism that encourages the model to allocate sufficient attention to the foreground. We investigate various strategies, including modifying the loss function or incorporating additional architectural components, to enable the classifier to effectively capture the primary concept within an image. Additionally, we explore the impact of different foreground attention mechanisms on model performance and provide insights into their effectiveness. Through extensive experimentation on benchmark datasets, we demonstrate the efficacy of our proposed approach in improving the classification accuracy of image classifiers. Our findings highlight the importance of foreground attention in enhancing model understanding and representation of the main concepts within images. The results of this study contribute to advancing the field of image classification and provide valuable insights for developing more robust and accurate deep-learning models.


Global-Local Processing in Convolutional Neural Networks

arXiv.org Artificial Intelligence

Convolutional Neural Networks (CNNs) have achieved outstanding performance on image processing challenges. Actually, CNNs imitate the typically developed human brain structures at the micro-level (Artificial neurons). At the same time, they distance themselves from imitating natural visual perception in humans at the macro architectures (high-level cognition). Recently it has been investigated that CNNs are highly biased toward local features and fail to detect the global aspects of their input. Nevertheless, the literature offers limited clues on this problem. To this end, we propose a simple yet effective solution inspired by the unconscious behavior of the human pupil. We devise a simple module called Global Advantage Stream (GAS) to learn and capture the holistic features of input samples (i.e., the global features). Then, the GAS features were combined with a CNN network as a plug-and-play component called the Global/Local Processing (GLP) model. The experimental results confirm that this stream improves the accuracy with an insignificant additional computational/temporal load and makes the network more robust to adversarial attacks. Furthermore, investigating the interpretation of the model shows that it learns a more holistic representation similar to the perceptual system of healthy humans


Quantifying Overfitting: Evaluating Neural Network Performance through Analysis of Null Space

arXiv.org Artificial Intelligence

Machine learning models that are overfitted/overtrained are more vulnerable to knowledge leakage, which poses a risk to privacy. Suppose we download or receive a model from a third-party collaborator without knowing its training accuracy. How can we determine if it has been overfitted or overtrained on its training data? It's possible that the model was intentionally over-trained to make it vulnerable during testing. While an overfitted or overtrained model may perform well on testing data and even some generalization tests, we can't be sure it's not over-fitted. Conducting a comprehensive generalization test is also expensive. The goal of this paper is to address these issues and ensure the privacy and generalization of our method using only testing data. To achieve this, we analyze the null space in the last layer of neural networks, which enables us to quantify overfitting without access to training data or knowledge of the accuracy of those data. We evaluated our approach on various architectures and datasets and observed a distinct pattern in the angle of null space when models are overfitted. Furthermore, we show that models with poor generalization exhibit specific characteristics in this space. Our work represents the first attempt to quantify overfitting without access to training data or knowing any knowledge about the training samples.


A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges

arXiv.org Artificial Intelligence

Machine learning models often encounter samples that are diverged from the training distribution. Failure to recognize an out-of-distribution (OOD) sample, and consequently assign that sample to an in-class label significantly compromises the reliability of a model. The problem has gained significant attention due to its importance for safety deploying models in open-world settings. Detecting OOD samples is challenging due to the intractability of modeling all possible unknown distributions. To date, several research domains tackle the problem of detecting unfamiliar samples, including anomaly detection, novelty detection, one-class learning, open set recognition, and out-of-distribution detection. Despite having similar and shared concepts, out-of-distribution, open-set, and anomaly detection have been investigated independently. Accordingly, these research avenues have not cross-pollinated, creating research barriers. While some surveys intend to provide an overview of these approaches, they seem to only focus on a specific domain without examining the relationship between different domains. This survey aims to provide a cross-domain and comprehensive review of numerous eminent works in respective areas while identifying their commonalities. Researchers can benefit from the overview of research advances in different fields and develop future methodology synergistically. Furthermore, to the best of our knowledge, while there are surveys in anomaly detection or one-class learning, there is no comprehensive or up-to-date survey on out-of-distribution detection, which our survey covers extensively. Finally, having a unified cross-domain perspective, we discuss and shed light on future lines of research, intending to bring these fields closer together.


Fake It Till You Make It: Towards Accurate Near-Distribution Novelty Detection

arXiv.org Artificial Intelligence

We aim for image-based novelty detection. Despite considerable progress, existing models either fail or face a dramatic drop under the so-called "near-distribution" setting, where the differences between normal and anomalous samples are subtle. We first demonstrate existing methods experience up to 20% decrease in performance in the near-distribution setting. Next, we propose to exploit a score-based generative model to produce synthetic near-distribution anomalous data. Our model is then fine-tuned to distinguish such data from the normal samples. We provide a quantitative as well as qualitative evaluation of this strategy, and compare the results with a variety of GAN-based models. Effectiveness of our method for both the near-distribution and standard novelty detection is assessed through extensive experiments on datasets in diverse applications such as medical images, object classification, and quality control. This reveals that our method considerably improves over existing models, and consistently decreases the gap between the near-distribution and standard novelty detection performance. Such samples are called anomalous, while the training set is referred to as normal. One has access to only normal data during training in ND. Recently, PANDA (33) and CSI (43) have considerably pushed state-of-the-art and achieved more than 90% the area under the receiver operating characteristics (AUROC) on the CIFAR-10 dataset (22) in the ND task, where one class is assumed to be normal and the rest are considered anomalous. However, as we will show empirically, these methods struggle to achieve a similar performance in situations where outliers are semantically close to the normal distribution, e.g. In the literature novelty detection and anomaly detection are used interchangeably.