The authors sincerely thank all the reviewers for their very constructive and helpful comments

Neural Information Processing Systems 

The authors sincerely thank all the reviewers for their very constructive and helpful comments. Prune: 40% of the channels are pruned. Empirically, this decay policy works better than a constant one: 78.16% verses KD together with Dirac is provided in Table 3 and Table 4 instead (the KD(MSE)+Dirac column). ResNet-18 and plain CNN18 (naive training) is 77.92% and 77.44%, respectively.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found