Figure 5: Loss surface using [35]; SGD (top) and PSGD (bottom)
Neural Information Processing Systems
We thank the reviewers for their positive and constructive feedback. Note that our PSGD attains accuracy similar to that of the SGD-trained model at FP . A similar rationale is given in Sec. Note that at lower bit-widths such as W2A8, we attain 62.7% accuracy, while LAPQ attains 1.3%. The detailed definition and proof are in [38].