The authors sincerely thank all the reviewers for their very constructive and helpful comments
–Neural Information Processing Systems
The authors sincerely thank all the reviewers for their very constructive and helpful comments. Prune: 40% of the channels are pruned. Empirically, this decay policy works better than a constant one: 78.16% verses KD together with Dirac is provided in Table 3 and Table 4 instead (the KD(MSE)+Dirac column). ResNet-18 and plain CNN18 (naive training) is 77.92% and 77.44%, respectively.
Neural Information Processing Systems
Oct-3-2025, 02:38:46 GMT
- Technology: