On the Role of Attention Heads in Large Language Model Safety
