MCU-MixQ: A HW/SW Co-optimized Mixed-precision Neural Network Design Framework for MCUs

Gong, Junfeng, Liu, Cheng, Cheng, Long, Li, Huawei, Li, Xiaowei

arXiv.org Artificial Intelligence 

Junfeng Gong (1, 2), Cheng Liu (1, 2), Long Cheng (3), Huawei Li (1, 2), Xiaowei Li (1, 2)

(1) SKLP, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
(2) Dept. of Computer Science, University of Chinese Academy of Sciences, Beijing, China
(3) School of Control and Computer Engineering, North China Electric Power University, Beijing, China

Abstract: Mixed-precision neural networks (MPNNs), which use just enough data width for neural network processing, are an effective approach to meeting the stringent memory and computing resource constraints of MCUs. Nevertheless, MCU-class ISAs still lack sub-byte and mixed-precision single instruction, multiple data (SIMD) operations, and the limited computing capability of MCUs remains underutilized, which further aggravates the compute bound encountered in neural network processing. As a result, the benefits of MPNNs cannot be fully unleashed. In this work, we propose to pack multiple low-bitwidth arithmetic operations into a single SIMD instruction on typical MCUs, and then develop an efficient convolution operator by exploiting both the data parallelism and the computing parallelism in convolution along with the proposed SIMD packing. Finally, we leverage Neural Architecture Search (NAS) to build a HW/SW co-designed MPNN design framework, namely MCU-MixQ. According to our experimental results, MCU-MixQ achieves 2.1× and 1.4× speedup over CMix-NN and MCUNet, respectively, under the same resource constraints.

I. INTRODUCTION

The application of artificial intelligence (AI) has become prevalent in typical Internet of Things (IoT) scenarios such as health monitoring, mechanical equipment fault diagnosis, and industrial automation. These applications commonly rely on microcontrollers (MCUs), known for their ultra-low power consumption and cost, as their central processing units.