Sparse Autoencoders Reveal Universal Feature Spaces Across Large Language Models

Open in new window