CIMNAS: A Joint Framework for Compute-In-Memory-Aware Neural Architecture Search
Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama
–arXiv.org Artificial Intelligence
Abstract--To maximize hardware efficiency and performance accuracy in Compute-In-Memory (CIM)-based neural network accelerators for Artificial Intelligence (AI) applications, co-optimizing both software and hardware design parameters is essential. Manual tuning is impractical due to the vast number of parameters and their complex interdependencies. To effectively automate the design and optimization of CIM-based neural network accelerators, hardware-aware neural architecture search (HW-NAS) techniques can be applied. This work introduces CIMNAS, a joint model-quantization-hardware optimization framework for CIM architectures. CIMNAS simultaneously searches across software parameters, quantization policies, and a broad range of hardware parameters, incorporating device-, circuit-, and architecture-level co-optimizations. CIMNAS experiments were conducted over a search space of 9.9 × 10^… parameter combinations. Evaluated on the ImageNet dataset, CIMNAS achieved a reduction in energy-delay-area product (EDAP) ranging from 90.1× to 104.5×, an improvement in TOPS/W between 4.68× and 4.82×, and an enhancement in TOPS/mm². The adaptability and robustness of CIMNAS are demonstrated by extending the framework to support the SRAM-based ResNet50 architecture, achieving up to an 819.5× reduction in EDAP. Unlike other state-of-the-art methods, CIMNAS achieves EDAP-focused optimization without any accuracy loss, generating diverse software-hardware parameter combinations for high-performance CIM-based neural network designs.

The exponential growth of Artificial Intelligence (AI) applications and the increasing complexity of AI models are raising the energy demands of training and processing AI workloads [1]. This trend has created a demand for more sustainable and energy-efficient hardware solutions for AI applications. Compute-In-Memory (CIM) neural network accelerators have emerged as promising architectures for achieving energy-efficient AI processing [2]-[6]. To maximize the hardware efficiency of CIM accelerators and maintain high performance for neural network workloads, it is essential to co-optimize both neural network model parameters and CIM hardware parameters [7].

Mohammed Fouda is with Compumacy for Artificial Intelligence Solutions, Cairo, Egypt.
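To make the joint search space and EDAP-focused objective described in the abstract concrete, the sketch below shows how a candidate software-quantization-hardware configuration could be sampled and scored by energy-delay-area product under a hard accuracy constraint. It is a minimal illustration under assumed names and ranges (depth_multiplier, weight_bits, adc_resolution_bits, crossbar_size, accuracy_floor, and so on are hypothetical), not the CIMNAS implementation or its actual search space.

```python
import random

# Hypothetical joint search space. Parameter names and value ranges are
# illustrative assumptions, not the actual CIMNAS search space from the paper.
SEARCH_SPACE = {
    # software (model) parameters
    "depth_multiplier":    [0.5, 0.75, 1.0, 1.25],
    "width_multiplier":    [0.5, 0.75, 1.0, 1.25],
    # quantization policy
    "weight_bits":         [2, 4, 6, 8],
    "activation_bits":     [4, 8],
    # hardware parameters spanning device, circuit, and architecture levels
    "adc_resolution_bits": [4, 5, 6, 7, 8],
    "crossbar_size":       [64, 128, 256],
    "device_on_off_ratio": [10, 50, 100],
}

def sample_design(space):
    """Randomly sample one joint software-quantization-hardware configuration."""
    return {name: random.choice(values) for name, values in space.items()}

def edap(energy_j, delay_s, area_mm2):
    """Energy-delay-area product (EDAP); lower is better."""
    return energy_j * delay_s * area_mm2

def objective(accuracy, energy_j, delay_s, area_mm2, accuracy_floor=0.76):
    """EDAP-focused objective with a hard accuracy constraint, so efficiency
    is never gained at the cost of accuracy (accuracy_floor is a placeholder,
    not a number from the paper)."""
    if accuracy < accuracy_floor:
        return float("inf")  # reject configurations that lose accuracy
    return edap(energy_j, delay_s, area_mm2)

if __name__ == "__main__":
    design = sample_design(SEARCH_SPACE)
    # In a real flow, accuracy would come from training or an accuracy
    # predictor, and energy/delay/area from a CIM hardware-evaluation model;
    # the numbers below are placeholders.
    score = objective(accuracy=0.77, energy_j=1.2e-3, delay_s=4.0e-3, area_mm2=85.0)
    print(design, score)
```

The hard accuracy constraint mirrors the abstract's claim of EDAP-focused optimization without accuracy loss; a full HW-NAS loop would replace the random sampling with a learned or evolutionary search strategy over the same joint space.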
arXiv.org Artificial Intelligence, Oct-1-2025