Efficient Deployment of CNN Models on Multiple In-Memory Computing Units

Eleni Bougioukou, Theodore Antonakopoulos

arXiv.org Artificial Intelligence 

Abstract--In-Memory Computing (IMC) represents a paradigm shift in deep learning acceleration by mitigating data-movement bottlenecks and leveraging the inherent parallelism of memory-based computations. In this work, we use an IMC Emulator (IMCE) with multiple Processing Units (PUs) to investigate how deploying a CNN model on a multi-processing system affects its performance in terms of processing rate and latency. For that purpose, we introduce the Load-Balance-Longest-Path (LBLP) algorithm, which dynamically assigns all CNN nodes to the available IMCE PUs, maximizing the processing rate and minimizing latency through efficient resource utilization. We benchmark LBLP against alternative scheduling strategies on a number of CNN models, and the experimental results demonstrate the effectiveness of the proposed algorithm.

With the rapid growth of the Internet of Things (IoT) and Cloud Computing, there is a growing need for efficient deep learning models that can operate on diverse computing platforms, ranging from resource-constrained edge devices to high-performance data centers. Among others, Convolutional Neural Networks (CNNs) have become a cornerstone of deep learning [1], driving advances in image classification, object detection, and other computer vision tasks.
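To make the general idea concrete, the following is a minimal, hypothetical sketch of a longest-path-aware load-balancing scheduler in the spirit of the LBLP description above. The graph representation, node costs, and tie-breaking rules are illustrative assumptions for this sketch, not the authors' actual algorithm: nodes are prioritized by the cost of the longest path they start in the CNN's dataflow DAG, and each node is then placed on the currently least-loaded PU.

```python
def longest_path_lengths(graph, cost):
    """For each node of a DAG, the total cost of the longest path starting there.

    graph: dict mapping node -> list of successor nodes.
    cost:  dict mapping node -> execution cost on a PU (assumed known).
    """
    memo = {}

    def lp(node):
        if node not in memo:
            succs = graph.get(node, [])
            # Longest path = own cost plus the best continuation (0 for sinks).
            memo[node] = cost[node] + max((lp(s) for s in succs), default=0)
        return memo[node]

    for n in graph:
        lp(n)
    return memo


def lblp_sketch(graph, cost, num_pus):
    """Assign each CNN node to one of num_pus processing units.

    Nodes are visited in decreasing longest-path priority, so critical-path
    nodes are placed first; each node goes to the least-loaded PU so far.
    Returns (assignment: node -> PU index, loads: total cost per PU).
    """
    prio = longest_path_lengths(graph, cost)
    loads = [0.0] * num_pus
    assignment = {}
    for node in sorted(graph, key=lambda n: -prio[n]):
        pu = min(range(num_pus), key=lambda i: loads[i])
        assignment[node] = pu
        loads[pu] += cost[node]
    return assignment, loads
```

For example, a small diamond-shaped graph (one conv layer feeding two parallel branches that merge into a fully connected layer) with per-node costs can be scheduled across two PUs with `lblp_sketch(graph, cost, 2)`; the balance of the resulting `loads` list gives a first impression of how evenly the work is spread.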