Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation
