Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
