Contrast-CAT: Contrasting Activations for Enhanced Interpretability in Transformer-based Text Classifiers
