Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation (Appendix) A Details of the considered distributions

Neural Information Processing Systems

In this paper, we consider various distributions for the node coordinates in VRPs, following which we randomly generate instances for both training and testing. Below we present details on how to generate those instances. The first distribution considers uniformly distributed nodes; an exemplary instance is displayed in Figure 1(i). The mixture distribution combines the two distributions above, each contributing half of the nodes.
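The instance generation described above can be sketched as follows. The uniform case and the half/half mixture follow the text directly; the clustered distribution (Gaussian blobs around random centers, with `n_clusters` and `scale` as free parameters) is an assumption filled in for illustration, since the excerpt only references it indirectly as one of "the two distributions above".

```python
import numpy as np

def uniform_instance(n, rng):
    # node coordinates drawn i.i.d. from the unit square
    return rng.random((n, 2))

def cluster_instance(n, n_clusters, rng, scale=0.05):
    # assumed clustered distribution: Gaussian blobs around random
    # centers, clipped back into the unit square
    centers = rng.random((n_clusters, 2))
    idx = rng.integers(n_clusters, size=n)
    points = centers[idx] + rng.normal(0.0, scale, size=(n, 2))
    return np.clip(points, 0.0, 1.0)

def mixture_instance(n, n_clusters, rng):
    # half of the nodes uniform, half clustered
    half = n // 2
    return np.vstack([uniform_instance(half, rng),
                      cluster_instance(n - half, n_clusters, rng)])
```

All coordinates land in [0, 1]^2, so instances from the different distributions are directly comparable.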






Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation

Bi, Jieyi, Ma, Yining, Wang, Jiahai, Cao, Zhiguang, Chen, Jinbiao, Sun, Yuan, Chee, Yeow Meng

arXiv.org Artificial Intelligence

Recent neural methods for vehicle routing problems always train and test the deep models on the same instance distribution (i.e., uniform). To tackle the consequent cross-distribution generalization concerns, we bring knowledge distillation to this field and propose an Adaptive Multi-Distribution Knowledge Distillation (AMDKD) scheme for learning more generalizable deep models. In particular, our AMDKD leverages the knowledge of multiple teachers trained on exemplar distributions to yield a lightweight yet generalist student model. Meanwhile, we equip AMDKD with an adaptive strategy that allows the student to concentrate on difficult distributions, so as to absorb hard-to-master knowledge more effectively. Extensive experimental results show that, compared with the baseline neural methods, our AMDKD achieves competitive results on both unseen in-distribution and out-of-distribution instances, which are either randomly synthesized or adopted from benchmark datasets (i.e., TSPLIB and CVRPLIB). Notably, our AMDKD is generic and consumes less computational resources for inference.
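The adaptive strategy mentioned above could be sketched as follows; this is a hypothetical minimal illustration, not the paper's actual formulation. The idea shown is to turn the student's performance gap on each exemplar distribution into softmax sampling probabilities, so that harder distributions are selected (and distilled from) more often.

```python
import numpy as np

def distribution_sampling_probs(student_gaps, temperature=1.0):
    # Hypothetical adaptive strategy: map the student's optimality gap
    # on each exemplar distribution to a sampling probability via a
    # softmax, so the student concentrates on difficult distributions.
    g = np.asarray(student_gaps, dtype=float) / temperature
    e = np.exp(g - g.max())  # numerically stable softmax
    return e / e.sum()
```

With this weighting, a distribution on which the student lags furthest behind its teacher is drawn most frequently during training.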


Note on Learning Rate Schedules for Stochastic Optimization

Darken, Christian, Moody, John E.

Neural Information Processing Systems

We present and compare learning rate schedules for stochastic gradient descent, a general algorithm which includes LMS, online backpropagation, and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules in both speed of convergence and quality of solution.
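The three schedule families compared above can be written down directly. A common search-then-converge form is eta(t) = eta0 / (1 + t/tau), which stays near eta0 during an initial "search" phase and decays like eta0*tau/t in the asymptotic "converge" phase; the parameter names here are illustrative.

```python
def constant(eta0, t):
    # classical constant schedule
    return eta0

def running_average(eta0, t):
    # classical "1/t" (running average) schedule
    return eta0 / (t + 1)

def search_then_converge(eta0, tau, t):
    # roughly eta0 while t << tau (search phase), then
    # decays like eta0 * tau / t for t >> tau (converge phase)
    return eta0 / (1.0 + t / tau)
```

The switchover time tau controls how long the search phase lasts before the schedule enters its asymptotic 1/t decay.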

