eb1e78328c46506b46a4ac4a1e378b91-AuthorFeedback.pdf

Neural Information Processing Systems 

However, a model pre-trained with SGD suffers from the problem shown in Figure 1 (lines 44-49). To tackle this issue, we train a compression-friendly model at full precision (FP) with cross-entropy loss using PSGD. In Table 1 of ACIQ [2], the naive (channel-wise) baseline of ResNet-18 W4A4 (ImageNet) is 51.6% as opposed to 0.3% for ours (layer-wise). We performed additional experiments using a model trained with PSGD and then post-processed with a concurrent PTQ work, LAPQ [34], using the official code. Reviewer 3 (R3): Thank you for the meaningful feedback. I.
