Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages
