Keep Calm and Avoid Harmful Content: Concept Alignment and Latent Manipulation Towards Safer Answers

Open in new window