The Fine-tuning Process of PLMs on Poisoned Datasets

Apr-30-2026, 03:38:50 GMT–Neural Information Processing Systems

In this section, we show our empirical observations obtained from fine-tuning PLMs on poisoned datasets. Specifically, we demonstrate that the backdoor triggers are easier to learn from the lower layers than the features corresponding to the main task. This observation plays a pivotal role in designing and understanding our defense algorithm. In our experiment, we focus on the SST-2 dataset [30] and consider the widely adopted word-level backdoor trigger and the more stealthy style-level trigger. For the word-level trigger, we follow the approach in prior work [25] and adopt the meaningless word "bb" as the trigger to minimize its impact on the original text's semantic meaning.
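
To make the word-level setting concrete, the following is a minimal sketch of how a sentiment dataset could be poisoned with the "bb" trigger. It is not the authors' code: the function names, the random insertion position, the poisoning rate, and the target label are all assumptions introduced for illustration, and the excerpt does not state which values were used in the experiments.

```python
import random


def insert_word_trigger(text, trigger="bb", rng=None):
    """Insert the trigger word at a random word boundary.

    The insertion position is an assumption; the source only names the
    trigger word, not where it is placed.
    """
    rng = rng or random.Random(0)
    words = text.split()
    pos = rng.randint(0, len(words))  # randint is inclusive on both ends
    return " ".join(words[:pos] + [trigger] + words[pos:])


def poison_dataset(samples, poison_rate=0.1, target_label=1, trigger="bb", seed=0):
    """Return a poisoned copy of (text, label) pairs and the poisoned indices.

    poison_rate and target_label are hypothetical parameters chosen for
    this sketch, not values taken from the paper.
    """
    rng = random.Random(seed)
    n_poison = int(len(samples) * poison_rate)
    poison_idx = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (text, label) in enumerate(samples):
        if i in poison_idx:
            # Poisoned sample: add the trigger and flip to the target label.
            poisoned.append((insert_word_trigger(text, trigger, rng), target_label))
        else:
            # Clean sample: left untouched.
            poisoned.append((text, label))
    return poisoned, poison_idx


if __name__ == "__main__":
    toy = [
        ("a charming and often affecting journey", 1),
        ("the plot is thin and the jokes fall flat", 0),
        ("an uneven but watchable drama", 1),
        ("tedious from start to finish", 0),
    ]
    data, idx = poison_dataset(toy, poison_rate=0.5)
    for i, (text, label) in enumerate(data):
        print("poisoned" if i in idx else "clean   ", label, text)
```

A model fine-tuned on such data can associate the single token "bb" with the target label, which is a much simpler pattern than the sentence-level sentiment features of the main task.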

