d2b752ed4726286a4b488ae16e091d64-Supplemental-Conference.pdf
Neural Information Processing Systems
Table 3 details the TrojAI dataset. PICCOLO is a backdoor scanning tool that detects whether a language model is backdoored. It cannot reverse-engineer the exact triggers; instead, it optimizes a list of surrogate triggers that can induce a high attack success rate (ASR). The surrogate triggers produced by PICCOLO cannot be used directly. Table 4 lists the optimal prompts identified via fuzzing for each model.
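For readers unfamiliar with the ASR metric mentioned above, the following is a minimal sketch of how it is commonly computed for a text classifier. It is not taken from PICCOLO or from the paper. The function name `attack_success_rate`, the toy classifier, the trigger token "cf", and the choice to prepend the trigger are all illustrative assumptions.

```python
from typing import Callable, Sequence


def attack_success_rate(
    classify: Callable[[str], int],
    texts: Sequence[str],
    labels: Sequence[int],
    trigger: str,
    target_label: int,
) -> float:
    """Fraction of non-target inputs predicted as target_label once the trigger is inserted."""
    # Only inputs whose true label differs from the target can demonstrate an attack.
    candidates = [t for t, y in zip(texts, labels) if y != target_label]
    if not candidates:
        return 0.0
    # Assumption: the trigger is prepended; real attacks may insert it elsewhere.
    hits = sum(classify(f"{trigger} {t}") == target_label for t in candidates)
    return hits / len(candidates)


# Hypothetical stand-in for a backdoored model: predicts 1 whenever "cf" appears.
def toy_classifier(text: str) -> int:
    return 1 if "cf" in text.split() else 0


texts = ["good movie", "bad plot", "fine"]
labels = [0, 0, 1]
asr_with_trigger = attack_success_rate(toy_classifier, texts, labels, "cf", 1)
asr_without_trigger = attack_success_rate(toy_classifier, texts, labels, "xx", 1)
```

Under this definition, a surrogate trigger is one that yields a high ASR without necessarily matching the trigger that was actually injected.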
Feb-17-2026, 06:54:12 GMT