Can the Variation of Model Weights be used as a Criterion for Self-Paced Multilingual NMT?
