Do Sparse Autoencoders Generalize? A Case Study of Answerability

Open in new window