Investigating Bias Representations in Llama 2 Chat via Activation Steering
