vrp
Reinforcement Learning for Solving the Vehicle Routing Problem
We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning. In this approach, we train a single policy model that finds near-optimal solutions for a broad range of problem instances of similar size, only by observing the reward signals and following feasibility rules. We consider a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance. On capacitated VRP, our approach outperforms classical heuristics and Google's OR-Tools on medium-sized instances in solution quality with comparable computation time (after training). We demonstrate how our approach can handle problems with split delivery and explore the effect of such deliveries on the solution quality.
Learning to delegate for large-scale vehicle routing
While previous heuristic or learning-based works achieve decent solutions on small problem instances, their performance deteriorates in large problems. This article presents a novel learning-augmented local search framework to solve large-scale VRP. The method iteratively improves the solution by identifying appropriate subproblems and $delegating$ their improvement to a black box subsolver. At each step, we leverage spatial locality to consider only a linear number of subproblems, rather than exponential. We frame subproblem selection as regression and train a Transformer on a generated training set of problem instances. Our method accelerates state-of-the-art VRP solvers by 10x to 100x while achieving competitive solution qualities for VRPs with sizes ranging from 500 to 3000. Learned subproblem selection offers a 1.5x to 2x speedup over heuristic or random selection. Our results generalize to a variety of VRP distributions, variants, and solvers.
Reinforcement Learning for Solving the Vehicle Routing Problem
We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning. In this approach, we train a single policy model that finds near-optimal solutions for a broad range of problem instances of similar size, only by observing the reward signals and following feasibility rules. We consider a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance. On capacitated VRP, our approach outperforms classical heuristics and Google's OR-Tools on medium-sized instances in solution quality with comparable computation time (after training). We demonstrate how our approach can handle problems with split delivery and explore the effect of such deliveries on the solution quality.
Accelerating Vehicle Routing via AI-Initialized Genetic Algorithms
Greenberg, Ido, Sielski, Piotr, Linsenmaier, Hugo, Gandham, Rajesh, Mannor, Shie, Fender, Alex, Chechik, Gal, Meirom, Eli
Vehicle Routing Problems (VRP) are an extension of the Traveling Salesperson Problem and are a fundamental NP - hard challenge in combinatorial optimization. Solving VRP in real - time at large scale has become critical in numerous applications, from growing markets like last - mile delivery to emerging use - cases like interactive logistics planning. In many applications, one has to repeatedly solv e VRP instances dr a wn from the same distribution, yet current state - of - the - art solvers treat each instance on its own without leveraging previous examples . We introduce a n optimization framework where a reinforcement learning agent is trained on prior instances and quickly generate s initial solutions, which are then further optimized by a genetic algorithm. This framework, Evolutionary Algorithm with Reinforcement Learning Initialization ( EARLI), consistently outperforms current state - of - the - art solvers across various time budgets . For example, EARLI handles vehicle routing with 500 locations within one second, 10x faster than current solvers for the same solution quality, enabling real - time and interactive routing at scale . EARLI can generalize to new data, as we demonstrate on real e - commerce delivery data of a previously unseen city . By combin ing reinforcement learning and genetic algorithms, o ur hybrid framework takes a step forward to closer interdisciplinary collaboration between AI and optimization communities towards real - time optimization in diverse domains .
Lifelong Learner: Discovering Versatile Neural Solvers for Vehicle Routing Problems
Feng, Shaodi, Lin, Zhuoyi, Zhou, Jianan, Zhang, Cong, Li, Jingwen, Chen, Kuan-Wen, Jayavelu, Senthilnath, Ong, Yew-Soon
Deep learning has been extensively explored to solve vehicle routing problems (VRPs), which yields a range of data-driven neural solvers with promising outcomes. However, most neural solvers are trained to tackle VRP instances in a relatively monotonous context, e.g., simplifying VRPs by using Euclidean distance between nodes and adhering to a single problem size, which harms their off-the-shelf application in different scenarios. To enhance their versatility, this paper presents a novel lifelong learning framework that incrementally trains a neural solver to manage VRPs in distinct contexts. Specifically, we propose a lifelong learner (LL), exploiting a Transformer network as the backbone, to solve a series of VRPs. The inter-context self-attention mechanism is proposed within LL to transfer the knowledge obtained from solving preceding VRPs into the succeeding ones. On top of that, we develop a dynamic context scheduler (DCS), employing the cross-context experience replay to further facilitate LL looking back on the attained policies of solving preceding VRPs. Extensive results on synthetic and benchmark instances (problem sizes up to 18k) show that our LL is capable of discovering effective policies for tackling generic VRPs in varying contexts, which outperforms other neural solvers and achieves the best performance for most VRPs.
Demand Selection for VRP with Emission Quota
Najar, Farid, Barth, Dominique, Strozecki, Yann
Combinatorial optimization (CO) problems are traditionally addressed using Operations Research (OR) methods, including metaheuristics. In this study, we introduce a demand selection problem for the V ehicle Routing Problem (VRP) with an emission quota, referred to as QVRP. The objective is to minimize the number of omitted deliveries while respecting the pollution quota. We focus on the demand selection part, called Maximum Feasible V ehicle Assignment (MFV A), while the construction of a routing for the VRP instance is solved using classical OR methods. We propose several methods for selecting the packages to omit, both from machine learning (ML) and OR. Our results show that, in this static problem setting, classical OR-based methods consistently outperform ML-based approaches.
Learning to delegate for large-scale vehicle routing
While previous heuristic or learning-based works achieve decent solutions on small problem instances, their performance deteriorates in large problems. This article presents a novel learning-augmented local search framework to solve large-scale VRP. The method iteratively improves the solution by identifying appropriate subproblems and delegating their improvement to a black box subsolver. At each step, we leverage spatial locality to consider only a linear number of subproblems, rather than exponential. We frame subproblem selection as regression and train a Transformer on a generated training set of problem instances.