1757af1fe1429801bdf3abf5600f8bba-Supplemental-Conference.pdf
Neural Information Processing Systems
The point of highest Top-1 accuracy is marked with a five-pointed star for both approaches. In addition, as shown in Fig. r3, the optimal balance coefficients for MobileNetV3 on CIFAR10, MobileNetV3 on CIFAR100, and ResNet50 on CIFAR100 are 5, 10, and 10, respectively. In logits distillation, the main loss coefficient is 0.8 and the distillation coefficient is 0.2. In feature distillation, the feature distillation coefficient of each stage is 0.02. For CIFAR100, we train all 4 networks using the standard training setting and run the same test as above.
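As a rough illustration of how the coefficients quoted above could enter a training objective, the sketch below combines a main loss and a logits distillation loss with weights 0.8 and 0.2, and scales each per-stage feature loss by 0.02. Only these three coefficients come from the text. The use of cross-entropy as the main loss, KL divergence as the distillation loss, the `temperature` parameter, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math

# Coefficients stated in the text.
MAIN_COEF = 0.8      # main loss coefficient (logits distillation)
DISTILL_COEF = 0.2   # distillation coefficient (logits distillation)
FEATURE_COEF = 0.02  # feature distillation coefficient of each stage


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; `temperature` is an assumed parameter."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Assumed form of the main loss: cross-entropy against the true label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """Assumed form of the distillation loss: KL(teacher || student)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def logits_distillation_loss(student_logits, teacher_logits, label,
                             temperature=1.0):
    """Weighted sum 0.8 * main loss + 0.2 * distillation loss."""
    main = cross_entropy(student_logits, label)
    distill = kl_divergence(teacher_logits, student_logits, temperature)
    return MAIN_COEF * main + DISTILL_COEF * distill


def feature_distillation_term(stage_losses):
    """Sum of per-stage feature losses, each scaled by 0.02.

    How this term is combined with the main loss is not stated in the
    text, so only the weighted feature term is returned here.
    """
    return sum(FEATURE_COEF * loss for loss in stage_losses)
```

For example, `logits_distillation_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8], label=0)` returns the weighted objective for a single three-class sample, and `feature_distillation_term([1.0, 2.0, 3.0])` scales three hypothetical per-stage losses by 0.02 each before summing them.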
Feb-7-2026, 16:04:54 GMT