Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
