deleting more layers and


1757af1fe1429801bdf3abf5600f8bba-Supplemental-Conference.pdf

Neural Information Processing Systems

The point of highest Top-1 accuracy is marked with a five-pointed star for both approaches. In addition, as shown in Fig. r3, the optimal balance coefficients for MobileNetV3 on CIFAR10, MobileNetV3 on CIFAR100, and ResNet50 on CIFAR100 are 5, 10, and 10, respectively. In logits distillation, the main loss coefficient is 0.8 and the distillation coefficient is 0.2. In feature distillation, the feature distillation coefficient of each stage is 0.02. For CIFAR100, we train all four networks with the standard training setting and run the same test as above.
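To make the weighting concrete, the following is a minimal sketch of how the stated coefficients could enter the two distillation losses. Only the values 0.8, 0.2, and 0.02 come from the text. The specific loss terms are assumptions chosen for illustration: cross-entropy as the main loss, KL divergence between teacher and student softmax outputs for logits distillation, mean squared error per stage for feature distillation, and an unweighted main loss in the feature case. All function names are hypothetical.

```python
import math

# Coefficients stated in the text.
MAIN_COEF = 0.8      # main loss weight in logits distillation
DISTILL_COEF = 0.2   # distillation loss weight in logits distillation
FEATURE_COEF = 0.02  # feature distillation weight applied to each stage


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Assumed main loss: cross-entropy against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def logits_distillation_loss(student_logits, teacher_logits, label):
    """0.8 * main loss + 0.2 * distillation loss (KL is an assumption)."""
    main = cross_entropy(student_logits, label)
    distill = kl_divergence(softmax(teacher_logits), softmax(student_logits))
    return MAIN_COEF * main + DISTILL_COEF * distill


def feature_distillation_loss(student_logits, label, student_feats, teacher_feats):
    """Main loss plus 0.02 * per-stage feature loss (MSE is an assumption)."""
    main = cross_entropy(student_logits, label)
    feat = sum(FEATURE_COEF * mse(s, t)
               for s, t in zip(student_feats, teacher_feats))
    return main + feat
```

Here `student_feats` and `teacher_feats` are lists with one feature vector per stage, so the 0.02 weight is applied to every stage separately before summing. A temperature term, common in logits distillation, is omitted because the text does not mention one.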