Leveraging Inter-Layer Dependency for Post-Training Quantization

Oct-10-2024, 11:43:16 GMT–Neural Information Processing Systems

Prior works on Post-training Quantization (PTQ) typically separate a neural network into sub-nets and quantize them sequentially. This process pays little attention to the dependency across the sub-nets and is therefore suboptimal. In this paper, we propose a novel Network-Wise Quantization (NWQ) approach to fully leverage inter-layer dependency. NWQ faces a larger-scale combinatorial optimization problem over discrete variables than previous works, which raises two major challenges: over-fitting and the discrete optimization problem. NWQ alleviates over-fitting via an Activation Regularization (AR) technique, which better controls the activation distribution.
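To make the contrast between sequential and network-wise quantization concrete, here is a minimal sketch on a toy two-layer network. It is not the NWQ algorithm: the abstract does not specify how the discrete problem is solved, so this example simply brute-forces every round-down/round-up choice per weight, which is only feasible at toy scale. The function names, the weights, the calibration inputs, and the grid step of 0.25 are all assumptions chosen for illustration.

```python
import itertools
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(layers, x):
    # Linear layers with ReLU between them (no ReLU after the last one).
    for i, W in enumerate(layers):
        x = matvec(W, x)
        if i < len(layers) - 1:
            x = [max(0.0, a) for a in x]
    return x

def candidates(W, step):
    # Every round-down / round-up choice for each weight on a uniform grid.
    cols = len(W[0])
    options = [(math.floor(w / step) * step, math.ceil(w / step) * step)
               for row in W for w in row]
    for choice in itertools.product(*options):
        yield [list(choice[r * cols:(r + 1) * cols]) for r in range(len(W))]

def output_mse(layers_a, layers_b, data):
    # Mean squared difference between the outputs of two layer stacks.
    total = 0.0
    for x in data:
        ya, yb = forward(layers_a, x), forward(layers_b, x)
        total += sum((a - b) ** 2 for a, b in zip(ya, yb))
    return total / len(data)

def quantize_sequential(fp_layers, data, step):
    # Quantize one layer at a time, matching only that layer's own output
    # and freezing the choice before moving on to the next layer.
    quantized = []
    for i in range(len(fp_layers)):
        best = min(candidates(fp_layers[i], step),
                   key=lambda c: output_mse(fp_layers[:i + 1], quantized + [c], data))
        quantized.append(best)
    return quantized

def quantize_network_wise(fp_layers, data, step):
    # Search the rounding choices of all layers jointly, matching the final output.
    joint = itertools.product(*(list(candidates(W, step)) for W in fp_layers))
    return list(min(joint, key=lambda q: output_mse(fp_layers, list(q), data)))

# Hypothetical full-precision weights and calibration inputs.
fp_layers = [[[0.37, -0.62], [0.81, 0.24]],
             [[0.55, -0.43], [-0.28, 0.66]]]
data = [[1.0, 0.5], [-0.3, 0.8], [0.6, -0.9]]
step = 0.25

seq_layers = quantize_sequential(fp_layers, data, step)
net_layers = quantize_network_wise(fp_layers, data, step)
seq_err = output_mse(fp_layers, seq_layers, data)
net_err = output_mse(fp_layers, net_layers, data)
print("sequential output MSE:  ", seq_err)
print("network-wise output MSE:", net_err)
```

Because the joint search space contains every configuration the sequential procedure can reach, the network-wise error on the calibration data can never be larger than the sequential one. The joint space also grows exponentially with the number of weights, and fitting it to a small calibration set is where the over-fitting concern mentioned in the abstract comes from.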

leveraging inter-layer dependency, optimization problem, quantization
