Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

Open in new window