Multi-Head Attention Debiasing and Contrastive Learning for Mitigating Dataset Artifacts in Natural Language Inference
