Controlling Large Language Models Through Concept Activation Vectors
