Towards Efficient Pre-Trained Language Model via Feature Correlation Distillation
