Rethinking Sparse Autoencoders: Select-and-Project for Fairness and Control from Encoder Features Alone
