GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
