AITopics | victim model

victim model

Backpropagating Linearly Improves Transferability of Adversarial Examples (Supplementary Material)

Neural Information Processing Systems, Apr-30-2026, 19:57:26 GMT

Empirical results in Section 3.1 of the main paper show that simply removing ReLUs leads to improved transferability. In this section, we try freezing all learnable parameters in the unmodified sub-net h during fine-tuning, and a similar observation about the initial improvement of transferability can still be made (see Figure 5). The classification loss of these modified VGG-19 models on the benign CIFAR-10 test set is also reported, in Figure 6. On ImageNet, it is evaluated on the 50,000 official validation images. As mentioned in the main paper, many recent successes in improving adversarial transferability benefit from maximizing intermediate-level distortions rather than the final prediction losses [8, 3, 2] of DNNs.

artificial intelligence, machine learning, source model, (16 more...)

Neural Information Processing Systems

Country: North America (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Backpropagating Linearly Improves Transferability of Adversarial Examples

Neural Information Processing Systems, Apr-30-2026, 19:57:19 GMT

The vulnerability of deep neural networks (DNNs) to adversarial examples has drawn great attention from the community. In this paper, we study the transferability of such examples, which lays the foundation of many black-box attacks on DNNs. We revisit a not-so-new but noteworthy hypothesis of Goodfellow et al. and show that transferability can be enhanced by improving the linearity of DNNs in an appropriate manner. We introduce linear backpropagation (LinBP), a method that performs backpropagation in a more linear fashion using off-the-shelf attacks that exploit gradients. More specifically, it computes the forward pass as normal but backpropagates the loss as if some nonlinear activations had not been encountered in the forward pass. Experimental results demonstrate that this simple yet effective method clearly outperforms the current state of the art in crafting transferable adversarial examples on CIFAR-10 and ImageNet, leading to more effective attacks on a variety of DNNs.
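
The idea of a normal forward pass combined with a more linear backward pass can be illustrated on a toy network. The following is a minimal sketch, not the authors' implementation: the two-layer network, its weights, the helper names, and the single sign-gradient step are all assumptions made for illustration, and the one ReLU here stands in for the "some nonlinear activations" that the abstract says are skipped.

```python
# Toy illustration of a linear backward pass (pure Python, no dependencies).
# Network: f(x) = w2 . relu(W1 x). All weights and names are hypothetical.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def input_gradient(W1, w2, x, linear_backward):
    """Gradient of w2 . relu(W1 x) with respect to x.

    The forward pass always uses the real ReLU. With linear_backward=True,
    the backward pass treats the ReLU derivative as 1 everywhere, as if the
    activation had not been encountered.
    """
    pre = matvec(W1, x)  # pre-activations from the normal forward pass
    mask = [1.0 if (linear_backward or p > 0) else 0.0 for p in pre]
    upstream = [w * m for w, m in zip(w2, mask)]
    # Multiply by the transpose of W1 to propagate back to the input.
    return [sum(W1[j][i] * upstream[j] for j in range(len(W1)))
            for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def sign_gradient_step(x, grad, eps):
    """One sign-gradient update, standing in for an off-the-shelf attack."""
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

W1 = [[1.0, -2.0], [-1.0, 0.5]]
w2 = [1.0, 3.0]
x = [1.0, 0.25]

standard_grad = input_gradient(W1, w2, x, linear_backward=False)
linbp_grad = input_gradient(W1, w2, x, linear_backward=True)
x_adv = sign_gradient_step(x, linbp_grad, eps=0.1)
```

In this toy case the second hidden unit is inactive, so standard backpropagation ignores its weights entirely, while the linear backward pass lets them contribute to the input gradient. Which activations to skip in a real network is a design choice that this sketch does not address.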

adversarial example, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (0.50)
Government > Military (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Adversarial Attacks on Graph Classification via Bayesian Optimisation

Neural Information Processing Systems, Apr-25-2026, 11:51:29 GMT

Graph neural networks, a popular class of models effective in a wide range of graph-based learning tasks, have been shown to be vulnerable to adversarial attacks. While the majority of the literature focuses on such vulnerability in node-level classification tasks, little effort has been dedicated to analysing adversarial attacks on graph-level classification, an important problem with numerous real-life applications such as biochemistry and social network analysis. The few existing methods often require unrealistic setups, such as access to internal information of the victim models, or an impractically-large number of queries. We present a novel Bayesian optimisation-based attack method for graph classification models. Our method is black-box, query-efficient and parsimonious with respect to the perturbation applied. We empirically validate the effectiveness and flexibility of the proposed method on a wide range of graph classification tasks involving varying graph properties, constraints and modes of attack. Finally, we analyse common interpretable patterns behind the adversarial samples produced, which may shed further light on the adversarial robustness of graph classification models.

artificial intelligence, graph, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

2433fec2144ccf5fea1c9c5ebdbc3924-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 02:26:24 GMT

artificial intelligence, cater, natural language, (16 more...)

Neural Information Processing Systems

Industry: Information Technology (0.32)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.48)

CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks

Neural Information Processing Systems, Apr-25-2026, 02:26:20 GMT

Previous works have validated that text generation APIs can be stolen through imitation attacks, causing IP violations. In order to protect the IP of text generation APIs, recent work has introduced a watermarking algorithm and utilized the null-hypothesis test as a post-hoc ownership verification on the imitation models. However, we find that it is possible to detect those watermarks via sufficient statistics of the frequencies of candidate watermarking words. To address this drawback, in this paper, we propose a novel Conditional wATERmarking framework (CATER) for protecting the IP of text generation APIs. An optimization method is proposed to decide the watermarking rules that can minimize the distortion of overall word distributions while maximizing the change of conditional word selections. Theoretically, we prove that it is infeasible for even the savviest attacker (they know how CATER works) to reveal the used watermarks from a large pool of potential word pairs based on statistical inspection. Empirically, we observe that high-order conditions lead to an exponential growth of suspicious (unused) watermarks, making our crafted watermarks more stealthy. In addition, CATER can effectively identify IP infringement under architectural mismatch and cross-domain imitation attacks, with negligible impairments on the generation quality of victim APIs. We envision our work as a milestone for stealthily protecting the IP of text generation APIs.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Europe (0.93)
North America > United States (0.68)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

23b9d4e18b151ba2108fb3f1efaf8de4-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 02:08:23 GMT

artificial intelligence, machine learning, surrogate model, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

23b9d4e18b151ba2108fb3f1efaf8de4-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 02:08:19 GMT

artificial intelligence, machine learning, query, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (0.48)
Government > Military (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Discussion of Evaluation Methodologies

Neural Information Processing Systems, Apr-25-2026, 01:14:47 GMT

In previous research, there has been much debate about textual backdoor evaluation, covering diverse metrics and experimental settings. These valuable discussions motivated us to construct a rigorous benchmark, and we highly appreciate their efforts. In this section, we briefly summarize existing opinions and provide a more detailed discussion of this topic. Table 9 summarizes the attackers that OpenBackdoor implements. Effectiveness. Besides the mainstream ASR (also called LFR [20]) and CACC metrics, there are other effectiveness metrics. Shen et al. [46] proposed counting the number of inserted triggers that can successfully flip the label. However, although inserting more triggers could benefit attack strength, the triggers also gradually corrupt the sentences, so the poisoned samples may themselves become "adversarial", and the two effects can hardly be distinguished. Shen et al. [45] also mentioned this issue, and they advised calculating the ASR difference between a poisoned model and a clean model as an effectiveness metric.
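
The effectiveness metrics named above can be made concrete with a short sketch. It assumes the usual readings of the acronyms (ASR as attack success rate, the fraction of triggered inputs classified as the attacker's target label; CACC as accuracy on clean inputs). The function names and the example predictions are hypothetical and are not taken from OpenBackdoor.

```python
# Sketch of backdoor effectiveness metrics (pure Python).
# Function names and the toy predictions are hypothetical.

def clean_accuracy(preds, labels):
    """CACC: fraction of clean test inputs classified correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def attack_success_rate(preds_on_triggered, target_label):
    """ASR: fraction of triggered inputs classified as the target label.

    Assumes the triggered inputs were drawn from non-target classes.
    """
    hits = sum(p == target_label for p in preds_on_triggered)
    return hits / len(preds_on_triggered)

def asr_difference(poisoned_model_preds, clean_model_preds, target_label):
    """ASR of the poisoned model minus ASR of a clean model on the same
    triggered inputs. A clean model that is also flipped indicates the
    triggers act as adversarial perturbations rather than as a backdoor."""
    return (attack_success_rate(poisoned_model_preds, target_label)
            - attack_success_rate(clean_model_preds, target_label))

target = 1
poisoned_model_on_triggered = [1, 1, 1, 0]  # backdoored model, triggered inputs
clean_model_on_triggered = [0, 1, 0, 0]     # clean model, same triggered inputs

asr_poisoned = attack_success_rate(poisoned_model_on_triggered, target)
asr_clean = attack_success_rate(clean_model_on_triggered, target)
asr_gap = asr_difference(poisoned_model_on_triggered,
                         clean_model_on_triggered, target)
cacc = clean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

Subtracting the clean model's ASR discounts label flips that the triggers would cause in any model, which is the concern raised about heavily corrupted poisoned samples.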

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Industry: