To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models
