Improving Steering Vectors by Targeting Sparse Autoencoder Features
