Defending LLMs against Jailbreaking Attacks via Backtranslation
