Weak-to-Strong Backdoor Attack for Large Language Models
