Goto

Collaborating Authors

 partition policy


Hierarchical Learning-based Graph Partition for Large-scale Vehicle Routing Problems

arXiv.org Artificial Intelligence

Neural solvers based on the divide-and-conquer approach for Vehicle Routing Problems (VRPs) in general, and capacitated VRP (CVRP) in particular, integrates the global partition of an instance with local constructions for each subproblem to enhance generalization. However, during the global partition phase, misclusterings within subgraphs have a tendency to progressively compound throughout the multi-step decoding process of the learning-based partition policy. This suboptimal behavior in the global partition phase, in turn, may lead to a dramatic deterioration in the performance of the overall decomposition-based system, despite using optimal local constructions. To address these challenges, we propose a versatile Hierarchical Learning-based Graph Partition (HLGP) framework, which is tailored to benefit the partition of CVRP instances by synergistically integrating global and local partition policies. Specifically, the global partition policy is tasked with creating the coarse multi-way partition to generate the sequence of simpler two-way partition subtasks. These subtasks mark the initiation of the subsequent K local partition levels. At each local partition level, subtasks exclusive for this level are assigned to the local partition policy which benefits from the insensitive local topological features to incrementally alleviate the compounded errors. This framework is versatile in the sense that it optimizes the involved partition policies towards a unified objective harmoniously compatible with both reinforcement learning (RL) and supervised learning (SL). (*Due to the notification of arXiv "The Abstract field cannot be longer than 1,920 characters", the appeared Abstract is shortened. For the full Abstract, please download the Article.)


DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis

arXiv.org Artificial Intelligence

The rapid development of deep neural networks (DNNs) is inherently accompanied by the problem of high computational costs. To tackle this challenge, dynamic voltage frequency scaling (DVFS) is emerging as a promising technology for balancing the latency and energy consumption of DNN inference by adjusting the computing frequency of processors. However, most existing models of DNN inference time are based on the CPU-DVFS technique, and directly applying the CPU-DVFS model to DNN inference on GPUs will lead to significant errors in optimizing latency and energy consumption. In this paper, we propose a DVFS-aware latency model to precisely characterize DNN inference time on GPUs. We first formulate the DNN inference time based on extensive experiment results for different devices and analyze the impact of fitting parameters. Then by dividing DNNs into multiple blocks and obtaining the actual inference time, the proposed model is further verified. Finally, we compare our proposed model with the CPU-DVFS model in two specific cases. Evaluation results demonstrate that local inference optimization with our proposed model achieves a reduction of no less than 66% and 69% in inference time and energy consumption respectively. In addition, cooperative inference with our proposed model can improve the partition policy and reduce the energy consumption compared to the CPU-DVFS model.