Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
