BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer

Oct-9-2024, 10:40:49 GMT – Neural Information Processing Systems

As the applications of deep learning models on edge devices increase at an accelerating pace, fast adaptation to various scenarios with varying resource constraints has become a crucial aspect of model deployment. As a result, model optimization strategies with adaptive configuration are becoming increasingly popular. While single-shot quantized neural architecture search enjoys flexibility in both model architecture and quantization policy, the combined search space comes with many challenges, including instability when training the weight-sharing supernet and difficulty in navigating the exponentially growing search space. Existing methods tend to either limit the architecture search space to a small set of options or limit the quantization policy search space to fixed precision policies. To this end, we propose BatchQuant, a robust quantizer formulation that allows fast and stable training of a compact, single-shot, mixed-precision, weight-sharing supernet.
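To make the idea of one set of shared weights serving several precision settings concrete, here is a minimal sketch of plain uniform min-max fake quantization applied to the same weights at different bit-widths. It is not the BatchQuant formulation, which the abstract does not specify. The function names, the layer names, and the bit-width policy are all hypothetical and chosen only for illustration.

```python
def fake_quantize(values, bits):
    """Uniform min-max fake quantization (a generic scheme, NOT BatchQuant).

    Each float is snapped to the nearest of 2**bits evenly spaced levels
    spanning [min(values), max(values)], then returned as a float.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate range: every value is already on the single level.
        return list(values)
    scale = (hi - lo) / (2 ** bits - 1)  # step between adjacent levels
    return [lo + round((v - lo) / scale) * scale for v in values]


def max_abs_error(values, bits):
    """Largest absolute rounding error introduced at the given bit-width."""
    quantized = fake_quantize(values, bits)
    return max(abs(v - q) for v, q in zip(values, quantized))


# One set of shared full-precision weights (hypothetical values).
shared_weights = [-1.0, -0.4, 0.0, 0.3, 1.0]

# A hypothetical mixed-precision policy: each layer picks its own bit-width.
policy = {"layer1": 2, "layer2": 4, "layer3": 8}

# The same weights are reused under every bit-width in the policy.
errors = {name: max_abs_error(shared_weights, bits) for name, bits in policy.items()}

for name, bits in policy.items():
    print(f"{name}: {bits}-bit, max abs error = {errors[name]:.4f}")
```

In this toy setting the rounding error shrinks as the bit-width grows, which illustrates why a supernet that shares weights across bit-widths needs a quantizer that behaves well over the whole precision range.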
