Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation
