MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization
–Neural Information Processing Systems
The tremendous number of parameters makes deep neural networks impractical to deploy in edge-device-based real-world applications because of limited computational power and storage space. Existing studies have made progress on learning quantized deep models to reduce model size and energy consumption, i.e., converting full-precision weights (r's) into discrete values (q's) in a supervised training manner. However, the training process for quantization is non-differentiable, which leads to either infinite or zero gradients (g_r) w.r.t. r. To address this problem, most training-based quantization methods use the gradient w.r.t. q to approximate g_r. However, these methods only heuristically make training-based quantization applicable, without further analysis of how the approximated gradients can assist the training of a quantized network.
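The non-differentiability described above can be seen in a small numeric sketch. Everything in it is an assumption chosen for illustration: the uniform quantizer, the step size, the squared-error loss, and the learning rate are hypothetical and are not taken from the paper. It shows that the gradient of the loss with respect to a full-precision weight r is zero away from the quantization thresholds, and that copying the gradient with respect to the quantized value q in its place still yields a usable update. This is a generic pass-through approximation, not the MetaQuant method.

```python
def quantize(r, step=0.5):
    """Hypothetical uniform quantizer: snap r to the nearest multiple of `step`."""
    return step * round(r / step)


def loss(q, target=1.0):
    """Toy squared-error loss on the quantized weight (illustrative only)."""
    return (q - target) ** 2


def numeric_grad(f, x, eps=1e-4):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


r = 0.3              # full-precision weight
q = quantize(r)      # discrete value used in the forward pass

# Gradient of the loss w.r.t. r *through* the quantizer: the quantizer is
# piecewise constant, so away from its thresholds this is zero.
g_r_true = numeric_grad(lambda x: loss(quantize(x)), r)

# Gradient of the loss w.r.t. the quantized value q (analytic for this loss).
g_q = 2 * (q - 1.0)

# Pass-through approximation (assumed here): use g_q in place of g_r.
g_r_approx = g_q

learning_rate = 0.1  # hypothetical
r_new = r - learning_rate * g_r_approx

print(q, g_r_true, g_q, r_new)
```

With the true gradient the weight would never move, whereas the approximated gradient pushes r toward the next quantization level. How well such an approximation actually assists training is the question the abstract raises.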
Oct-11-2024, 07:07:59 GMT