Entropy-Driven Mixed-Precision Quantization for Deep Network Design

Jan-17-2025, 03:35:00 GMT–Neural Information Processing Systems

Deploying deep convolutional neural networks on Internet-of-Things (IoT) devices is challenging because of their limited computational resources, such as small SRAM and Flash storage. Previous works take a two-stage approach: they first re-design a small network for IoT devices and then compress it by mixed-precision quantization. In this work, we propose a one-stage solution that optimizes the architecture and the quantization jointly and automatically. The key idea of our approach is to cast joint architecture design and quantization as an entropy maximization process. In particular, our algorithm automatically designs a tiny deep model such that: 1) its representation capacity, measured by entropy, is maximized under the given computational budget; 2) each layer is assigned a proper quantization precision; 3) the overall design loop runs on CPU, and no GPU is required.
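
To make the idea of budget-constrained precision assignment concrete, the following is a minimal sketch and not the method of the paper. It assumes a toy capacity score (log2 of the layer width plus the bit-width, summed over layers) in place of the paper's entropy measure, a Flash-style storage budget counted in bytes of weights, and a simple greedy search. The names `Layer`, `toy_entropy`, `storage_bytes` and `assign_bits` are hypothetical and exist only for this illustration. Like the design loop described above, it runs on CPU with the standard library alone.

```python
import math
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    params: int  # number of weights in the layer
    width: int   # number of output channels


def storage_bytes(layers, bits):
    """Weight storage in bytes when layer i is stored with bits[i] bits per weight."""
    return sum(l.params * b for l, b in zip(layers, bits)) / 8


def toy_entropy(layers, bits):
    """Toy capacity proxy (an assumption, not the paper's entropy score)."""
    return sum(math.log2(l.width) + b for l, b in zip(layers, bits))


def assign_bits(layers, budget_bytes, choices=(2, 4, 8)):
    """Greedily raise per-layer precision while the storage budget allows it."""
    idx = [0] * len(layers)  # position of each layer's bit-width in `choices`

    def current_bits():
        return [choices[i] for i in idx]

    if storage_bytes(layers, current_bits()) > budget_bytes:
        raise ValueError("budget too small even at the lowest precision")

    while True:
        used = storage_bytes(layers, current_bits())
        best = None  # (score gain per extra byte, layer index)
        for k, layer in enumerate(layers):
            if idx[k] + 1 >= len(choices):
                continue  # already at the highest precision
            extra_bits = choices[idx[k] + 1] - choices[idx[k]]
            extra_bytes = layer.params * extra_bits / 8
            if used + extra_bytes > budget_bytes:
                continue  # this upgrade would exceed the budget
            ratio = extra_bits / extra_bytes  # toy score gain per byte
            if best is None or ratio > best[0]:
                best = (ratio, k)
        if best is None:
            return current_bits()  # no affordable upgrade remains
        idx[best[1]] += 1


# Hypothetical three-layer network and a 5000-byte weight budget.
layers = [Layer("conv1", 1000, 16), Layer("conv2", 8000, 32), Layer("fc", 2000, 10)]
bits = assign_bits(layers, budget_bytes=5000)
print(dict(zip((l.name for l in layers), bits)), toy_entropy(layers, bits))
```

Under this toy score, small layers are upgraded first because each extra bit costs them fewer bytes. A real entropy measure would also depend on weight and activation statistics, and the joint search would vary the architecture as well as the bit-widths.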

deep network design, entropy-driven mixed-precision quantization, IoT device

Technology:
- Information Technology
  - Internet of Things (0.90)
  - Artificial Intelligence > Machine Learning
    - Neural Networks (0.79)