 adversarial perturbation


Functional Virtual Adversarial Training for Semi-Supervised Time Series Classification

Neural Information Processing Systems

Real-world time series analysis in domains such as healthcare, autonomous driving, and solar energy faces unique challenges arising from the scarcity of labeled data, highlighting the need for effective semi-supervised learning methods. While the Virtual Adversarial Training (VAT) method has shown promising performance in leveraging unlabeled data for smoother predictive distributions, straightforward extensions of VAT often fall short on time series tasks because they neglect the temporal structure of the data in the adversarial perturbation. In this paper, we propose the framework of functional Virtual Adversarial Training (f-VAT), which can incorporate the functional structure of the data into perturbations. By theoretically establishing a duality between the perturbation norm and the functional model sensitivity, we propose to use an appropriate Sobolev ($H^{-s}$) norm to generate structured functional adversarial perturbations for semi-supervised time series classification. Our proposed f-VAT method outperforms recent methods in extensive semi-supervised time series classification tasks (e.g., up to approximately 9% performance improvement). We also provide additional visualization studies to offer further insights into the superiority of f-VAT.
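To make the role of the norm concrete, the following is a minimal sketch of a discrete Sobolev norm computed with Fourier weights. It is not the authors' implementation: the function name `sobolev_norm`, the assumption of a periodic, uniformly sampled one-dimensional signal, and the weight $(1 + k^2)^s$ are illustrative choices. With a negative order such as $s = -1$, a high-frequency signal has a much smaller norm than a low-frequency signal of the same amplitude, which is why the choice of norm shapes the temporal structure of a norm-bounded perturbation.

```python
import numpy as np

def sobolev_norm(x, s):
    """Discrete H^s norm of a periodic 1-D signal (illustrative definition).

    Computes sqrt(sum_k (1 + k^2)^s * |c_k|^2), where c_k are the Fourier
    coefficients of x. s = 0 gives the root-mean-square of x; s < 0
    down-weights high frequencies.
    """
    n = x.shape[-1]
    coeffs = np.fft.rfft(x) / n              # coefficients for k = 0..n//2
    k = np.arange(coeffs.shape[-1])
    weights = (1.0 + k ** 2) ** s
    # rfft drops negative frequencies, so count interior frequencies twice.
    mult = np.full(k.shape, 2.0)
    mult[0] = 1.0
    if n % 2 == 0:
        mult[-1] = 1.0                       # the Nyquist term appears once
    return np.sqrt(np.sum(mult * weights * np.abs(coeffs) ** 2))

t = np.arange(256) / 256
low = np.sin(2 * np.pi * 1 * t)              # one cycle over the window
high = np.sin(2 * np.pi * 20 * t)            # twenty cycles, same amplitude

print("s = 0 :", sobolev_norm(low, 0.0), sobolev_norm(high, 0.0))
print("s = -1:", sobolev_norm(low, -1.0), sobolev_norm(high, -1.0))
```

Under these assumptions both signals have the same norm for $s = 0$, while for $s = -1$ the high-frequency signal is far "cheaper", so a budget expressed in that norm treats smooth and oscillatory perturbations very differently.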


Dual-Flow: Transferable Multi-Target, Instance-Agnostic Attacks via In-the-wild Cascading Flow Optimization

Neural Information Processing Systems

Adversarial attacks are widely used to evaluate model robustness, and in black-box scenarios, the transferability of these attacks becomes crucial. Existing generator-based attacks have excellent generalization and transferability due to their instance-agnostic nature. However, when training generators for multi-target tasks, the success rate of transfer attacks is relatively low due to the limitations of the model's capacity. To address these challenges, we propose a novel Dual-Flow framework for multi-target instance-agnostic adversarial attacks, utilizing Cascading Distribution Shift Training to develop an adversarial velocity function. Extensive experiments demonstrate that Dual-Flow significantly improves transferability over previous multi-target generative attacks. For example, it increases the success rate from Inception-v3 to ResNet-152 by 34.58%. Furthermore, our attack method shows substantially stronger robustness against defense mechanisms, such as adversarially trained models. The code of Dual-Flow is available at: https://github.com/Chyxx/Dual-Flow.



FrameShield: Adversarially Robust Video Anomaly Detection

Neural Information Processing Systems

Weakly Supervised Video Anomaly Detection (WSVAD) has achieved notable advancements, yet existing models remain vulnerable to adversarial attacks, limiting their reliability. Due to the inherent constraints of weak supervision--where only video-level labels are provided despite the need for frame-level predictions--traditional adversarial defense mechanisms, such as adversarial training, are not effective since video-level adversarial perturbations are typically weak and inadequate. To address this limitation, pseudo-labels generated directly from the model can enable frame-level adversarial training; however, these pseudo-labels are inherently noisy, significantly degrading performance. We therefore introduce a novel Pseudo-Anomaly Generation method called Spatiotemporal Region Distortion (SRD), which creates synthetic anomalies by applying severe augmentations to localized regions in normal videos while preserving temporal consistency. Integrating these precisely annotated synthetic anomalies with the noisy pseudo-labels substantially reduces label noise, enabling effective adversarial training. Extensive experiments demonstrate that our method significantly enhances the robustness of WSVAD models against adversarial attacks, outperforming state-of-the-art methods by an average of 71.0% in overall AUROC performance across multiple benchmarks.



Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models

Neural Information Processing Systems

Although Multimodal Large Language Models (MLLMs) have demonstrated remarkable achievements in recent years, they remain vulnerable to adversarial examples that result in harmful responses. Existing attacks typically focus on optimizing adversarial perturbations for a certain multimodal image-prompt pair or a fixed training dataset, which often leads to overfitting. Consequently, these perturbations fail to remain malicious once transferred to unseen image-prompt pairs, and covering the diverse multimodal inputs of complicated real-world scenarios incurs significant resource costs. To alleviate this issue, this paper proposes a novel adversarial attack on MLLMs based on distribution approximation theory, which models the potential image-prompt input distribution and adds the same distribution-fitting adversarial perturbation to multimodal input pairs to achieve effective cross-image/prompt transfer attacks. Specifically, we exploit the Laplace approximation to model the Gaussian distribution of the image and prompt inputs for the MLLM, deriving an estimate of the mean and covariance parameters. By sampling from this approximated distribution with a Monte Carlo mechanism, we efficiently optimize and fit a single input-agnostic perturbation over diverse image-prompt pairs, yielding strong universality and transferability. Extensive experiments are conducted to verify the strong adversarial capabilities of our proposed attack against prevalent MLLMs spanning a spectrum of images/prompts.


Enhancing CLIP Robustness via Cross-Modality Alignment

Neural Information Processing Systems

Vision-language models (VLMs) such as CLIP demonstrate strong generalization in zero-shot classification but remain highly vulnerable to adversarial perturbations. Existing methods primarily focus on adversarial fine-tuning or prompt optimization, but they often overlook the gaps in CLIP's encoded features, where the text and image features lie far apart from each other. This misalignment is significantly amplified under adversarial perturbations, leading to severe degradation in classification performance.




Robustness of classifiers: from adversarial to random noise

Neural Information Processing Systems

Several recent works have shown that state-of-the-art classifiers are vulnerable to worst-case (i.e., adversarial) perturbations of the datapoints. On the other hand, it has been empirically observed that these same classifiers are relatively robust to random noise. In this paper, we propose to study a semi-random noise regime that generalizes both the random and worst-case noise regimes. We propose the first quantitative analysis of the robustness of nonlinear classifiers in this general noise regime. We establish precise theoretical bounds on the robustness of classifiers in this general regime, which depend on the curvature of the classifier's decision boundary. Our bounds confirm and quantify the empirical observations that classifiers satisfying curvature constraints are robust to random noise. Moreover, we quantify the robustness of classifiers in terms of the subspace dimension in the semi-random noise regime, and show that our bounds remarkably interpolate between the worst-case and random noise regimes. We perform experiments and show that the derived bounds provide very accurate estimates when applied to various state-of-the-art deep neural networks and datasets. This result suggests bounds on the curvature of the classifiers' decision boundaries that we support experimentally, and more generally offers important insights into the geometry of high-dimensional classification problems.
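The interpolation between the two regimes can be illustrated on the simplest case, a linear classifier, which has zero curvature. The sketch below is an illustration under that assumption and not the paper's analysis of nonlinear classifiers; the helper names and the toy data are hypothetical. It computes the smallest perturbation that reaches the decision boundary when the perturbation is restricted to a random subspace of dimension m: m = 1 corresponds to a random direction, and m = d recovers the worst-case (adversarial) perturbation.

```python
import numpy as np

def boundary_distance_in_subspace(w, b, x, basis):
    """Smallest L2 perturbation inside span(basis) that puts x on w.x + b = 0.

    basis is a (d, m) matrix with orthonormal columns. Only the part of w
    that lies inside the subspace can change the classifier's output.
    """
    margin = abs(w @ x + b)
    w_in_subspace = basis.T @ w
    return margin / np.linalg.norm(w_in_subspace)

def random_orthonormal_basis(d, m, rng):
    """Orthonormal basis of a random m-dimensional subspace of R^d."""
    q, _ = np.linalg.qr(rng.standard_normal((d, m)))
    return q

rng = np.random.default_rng(0)
d = 400
w = rng.standard_normal(d)      # toy linear classifier sign(w.x + b)
b = 0.1
x = rng.standard_normal(d)      # toy datapoint

# Worst-case distance: move straight along w.
adversarial = abs(w @ x + b) / np.linalg.norm(w)

distances = {
    m: boundary_distance_in_subspace(w, b, x, random_orthonormal_basis(d, m, rng))
    for m in (1, 20, d)
}

print("adversarial:", adversarial)
for m, dist in distances.items():
    print(f"subspace dim {m:3d}: distance {dist:.4f} ({dist / adversarial:.1f}x)")
```

In this linear setting the restricted distance is never smaller than the adversarial one, and it equals the adversarial distance once the subspace is the whole space.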