Persona Vectors: Monitoring and Controlling Character Traits in Language Models
