Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders
