Self-Distillation Mixup Training for Non-autoregressive Neural Machine Translation
