Priority-Aware Model-Distributed Inference at Edge Networks

Dec-16-2024–arXiv.org Artificial Intelligence

Distributed inference techniques can be broadly classified into data-distributed and model-distributed schemes. In data-distributed inference (DDI), each worker carries the entire Machine Learning (ML) model but processes only a subset of the data. However, feeding the data to workers results in high communication costs, especially when the data is large. An emerging paradigm is model-distributed inference (MDI), where each worker carries only a subset of ML layers. In MDI, a source device that has data processes a few layers of ML model and sends the output to a neighboring device, i.e., offloads the rest of the layers. This process ends when all layers are processed in a distributed manner. In this paper, we investigate the design and development of MDI when multiple data sources co-exist. We consider that each data source has a different importance and, hence, a priority. We formulate and solve a priority-aware model allocation optimization problem. Based on the structure of the optimal solution, we design a practical Priority-Aware Model- Distributed Inference (PA-MDI) algorithm that determines model allocation and distribution over devices by taking into account the priorities of different sources. Experiments were conducted on a real-life testbed of NVIDIA Jetson Xavier and Nano edge devices as well as in the Colosseum testbed with ResNet-50, ResNet- 56, and GPT-2 models. The experimental results show that PA-MDI performs priority-aware model allocation successfully while reducing the inference time as compared to baselines.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

Dec-16-2024

arXiv.org PDF

Add feedback

Country:
- North America > United States > Illinois (0.14)

Genre:
- Research Report > New Finding (0.48)

Industry:
- Information Technology (0.68)
- Telecommunications (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language (1.00)
  - Representation & Reasoning > Optimization (1.00)