SDPGO: Efficient Self-Distillation Training Meets Proximal Gradient Optimization
