Review for NeurIPS paper: Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing

Neural Information Processing Systems 

The paper proposes a novel reinforcement learning approach to solving the capacitated vehicle routing problem. It involves learning a value function and solving a TSP for the prizing problem. Reviewers agree that the proposed approach is novel and interesting. One reviewer is sceptical of the work, doubting the performance achievable with the proposed approach. However, the ideas still merit presentation at NeurIPS, in the hope that they will bring advances to this research area.