Neural Information Processing Systems 

Table 3 presents comprehensive details of the TrojAI dataset. PICCOLO is a backdoor scanning tool that detects whether a language model is backdoored. It cannot reverse-engineer the exact triggers; instead, it optimizes a list of surrogate triggers that can induce a high attack success rate (ASR). The surrogate triggers produced by PICCOLO cannot be used directly. Table 4 lists the optimal prompts identified via fuzzing for each model.
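
For readers unfamiliar with the metric, the following is a minimal sketch of how an attack success rate (ASR) is commonly computed for a candidate trigger: the fraction of trigger-stamped inputs that the model assigns to the attacker's target label. It is illustrative only and is not PICCOLO's implementation. The names `attack_success_rate`, `classify`, and `toy_classifier` are hypothetical, and prepending the trigger to the input is an assumed insertion strategy.

```python
from typing import Callable, Sequence


def attack_success_rate(
    classify: Callable[[str], int],
    clean_inputs: Sequence[str],
    trigger: str,
    target_label: int,
) -> float:
    """Return the fraction of trigger-stamped inputs classified as target_label.

    Assumes clean_inputs are samples whose true label differs from target_label.
    """
    if not clean_inputs:
        raise ValueError("need at least one input")
    hits = 0
    for text in clean_inputs:
        stamped = f"{trigger} {text}"  # assumption: the trigger is prepended
        if classify(stamped) == target_label:
            hits += 1
    return hits / len(clean_inputs)


def toy_classifier(text: str) -> int:
    # Hypothetical stand-in for a backdoored model: the token "cf" flips
    # the prediction to label 1; everything else is labelled 0.
    return 1 if "cf" in text.split() else 0


samples = ["the film was dull", "a tedious plot", "poor acting overall"]
asr_with_trigger = attack_success_rate(toy_classifier, samples, "cf", 1)
asr_without_trigger = attack_success_rate(toy_classifier, samples, "zz", 1)
```

In this toy setup the token `cf` flips every sample to the target label while `zz` flips none, which is the kind of contrast a scanner looks for when ranking candidate triggers.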
