CIMNAS: A Joint Framework for Compute-In-Memory-Aware Neural Architecture Search
Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama
–arXiv.org Artificial Intelligence
Abstract--To maximize hardware efficiency and performance accuracy in Compute-In-Memory (CIM)-based neural network accelerators for Artificial Intelligence (AI) applications, co-optimizing both software and hardware design parameters is essential. Manual tuning is impractical due to the vast number of parameters and their complex interdependencies. To effectively automate the design and optimization of CIM-based neural network accelerators, hardware-aware neural architecture search (HW-NAS) techniques can be applied. This work introduces CIMNAS, a joint model-quantization-hardware optimization framework for CIM architectures. CIMNAS simultaneously searches across software parameters, quantization policies, and a broad range of hardware parameters, incorporating device-, circuit-, and architecture-level co-optimizations. CIMNAS experiments were conducted over a search space of 9.9 × 10^… parameter combinations. Evaluated on the ImageNet dataset, CIMNAS achieved a reduction in energy-delay-area product (EDAP) ranging from 90.1× to 104.5×, an improvement in TOPS/W between 4.68× and 4.82×, and an enhancement in TOPS/mm². The adaptability and robustness of CIMNAS are demonstrated by extending the framework to support the SRAM-based ResNet50 architecture, achieving up to an 819.5× reduction in EDAP. Unlike other state-of-the-art methods, CIMNAS achieves EDAP-focused optimization without any accuracy loss, generating diverse software-hardware parameter combinations for high-performance CIM-based neural network designs.

The exponential growth of Artificial Intelligence (AI) applications and the increasing complexity of AI models are raising the energy demands of training and processing AI workloads [1]. This trend has created a demand for more sustainable and energy-efficient hardware solutions for AI applications. Compute-In-Memory (CIM) neural network accelerators have emerged as promising architectures for achieving energy-efficient AI processing [2]-[6]. To maximize the hardware efficiency of CIM accelerators and maintain high performance for neural network workloads, it is essential to co-optimize both neural network model parameters and CIM hardware parameters [7].

Mohammed Fouda is with Compumacy for Artificial Intelligence Solutions, Cairo, Egypt.
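To make the joint search space and EDAP-focused objective described in the abstract concrete, the sketch below shows how a candidate software-quantization-hardware configuration could be sampled and scored by energy-delay-area product under a hard accuracy constraint. It is a minimal illustration under assumed names and ranges (depth_multiplier, weight_bits, adc_resolution_bits, crossbar_size, accuracy_floor, and so on are hypothetical), not the CIMNAS implementation or its actual search space.

```python
import random

# Hypothetical joint search space. Parameter names and value ranges are
# illustrative assumptions, not the actual CIMNAS search space from the paper.
SEARCH_SPACE = {
    # software (model) parameters
    "depth_multiplier":    [0.5, 0.75, 1.0, 1.25],
    "width_multiplier":    [0.5, 0.75, 1.0, 1.25],
    # quantization policy
    "weight_bits":         [2, 4, 6, 8],
    "activation_bits":     [4, 8],
    # hardware parameters spanning device, circuit, and architecture levels
    "adc_resolution_bits": [4, 5, 6, 7, 8],
    "crossbar_size":       [64, 128, 256],
    "device_on_off_ratio": [10, 50, 100],
}

def sample_design(space):
    """Randomly sample one joint software-quantization-hardware configuration."""
    return {name: random.choice(values) for name, values in space.items()}

def edap(energy_j, delay_s, area_mm2):
    """Energy-delay-area product (EDAP); lower is better."""
    return energy_j * delay_s * area_mm2

def objective(accuracy, energy_j, delay_s, area_mm2, accuracy_floor=0.76):
    """EDAP-focused objective with a hard accuracy constraint, so efficiency
    is never gained at the cost of accuracy (accuracy_floor is a placeholder,
    not a number from the paper)."""
    if accuracy < accuracy_floor:
        return float("inf")  # reject configurations that lose accuracy
    return edap(energy_j, delay_s, area_mm2)

if __name__ == "__main__":
    design = sample_design(SEARCH_SPACE)
    # In a real flow, accuracy would come from training or an accuracy
    # predictor, and energy/delay/area from a CIM hardware-evaluation model;
    # the numbers below are placeholders.
    score = objective(accuracy=0.77, energy_j=1.2e-3, delay_s=4.0e-3, area_mm2=85.0)
    print(design, score)
```

The hard accuracy constraint mirrors the abstract's claim of EDAP-focused optimization without accuracy loss; a full HW-NAS loop would replace the random sampling with a learned or evolutionary search strategy over the same joint space.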
arXiv.org Artificial Intelligence, Oct-1-2025