bi-GRPO: Bidirectional Optimization for Jailbreak Backdoor Injection on LLMs
