eb1e78328c46506b46a4ac4a1e378b91-AuthorFeedback.pdf

Feb-10-2026, 23:18:55 GMT–Neural Information Processing Systems

However, a model pre-trained with SGD suffers from the problem shown in Figure 1 (lines 44-49). To tackle this issue, we train a compression-friendly model at full precision (FP) with cross-entropy loss using PSGD. In Table 1 of ACIQ [2], the naive (channel-wise) baseline of ResNet-18 W4A4 (ImageNet) is 51.6%, as opposed to 0.3% for ours (layer-wise). We performed additional experiments using a model trained with PSGD, then post-processed with a concurrent PTQ work, LAPQ [34], using the official code. Reviewer 3 (R3): Thank you for the meaningful feedback. I.
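The gap between the channel-wise and layer-wise baselines quoted above comes from how the quantization scale is chosen. The following is a minimal sketch of that difference, assuming a simple symmetric max-abs 4-bit quantizer; this scheme, the function names, and the toy weights are illustrative assumptions, not the method of ACIQ [2], LAPQ [34], or the authors.

```python
# Hypothetical sketch: layer-wise vs. channel-wise symmetric weight quantization.
# The max-abs scale rule is an assumption chosen for simplicity.

def max_abs_scale(values, num_bits):
    """Step size that maps the largest magnitude to the top integer level."""
    qmax = 2 ** (num_bits - 1) - 1  # 7 for 4-bit signed
    return max(abs(v) for v in values) / qmax

def fake_quantize(values, scale, num_bits):
    """Round to the integer grid, clamp, and map back to floats."""
    qmax = 2 ** (num_bits - 1) - 1
    if scale == 0.0:
        return list(values)
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]

def layer_wise(weights, num_bits=4):
    """One shared scale for every channel in the layer."""
    scale = max_abs_scale([v for ch in weights for v in ch], num_bits)
    return [fake_quantize(ch, scale, num_bits) for ch in weights]

def channel_wise(weights, num_bits=4):
    """A separate scale for each output channel."""
    return [fake_quantize(ch, max_abs_scale(ch, num_bits), num_bits)
            for ch in weights]

def mse(a, b):
    """Mean squared error between two nested lists of equal shape."""
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)

# Toy layer: the second channel has a much smaller range than the first.
weights = [[7.0, -3.5, 1.0], [0.07, -0.035, 0.01]]
err_layer = mse(weights, layer_wise(weights))
err_channel = mse(weights, channel_wise(weights))
print(f"layer-wise MSE:   {err_layer:.6f}")
print(f"channel-wise MSE: {err_channel:.6f}")
```

In this toy layer the shared scale is set by the large-range channel, so the small-range channel collapses to zero under layer-wise quantization, while a per-channel scale preserves it. This is one reason a layer-wise naive baseline can be far weaker than a channel-wise one at 4 bits.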



Figure 5: Loss surface using [35]; SGD (top) and PSGD (bottom)
