Self-Ablating Transformers: More Interpretability, Less Sparsity
