Rationalizing Transformer Predictions via End-To-End Differentiable Self-Training
