Identifiable Steering via Sparse Autoencoding of Multi-Concept Shifts

Open in new window