Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM
Rongjie Zhu, Cong Zhang, Zhiguang Cao
arXiv.org Artificial Intelligence
While large language models (LLMs) are increasingly used as automated heuristic designers for vehicle routing problems (VRPs), current state-of-the-art methods predominantly rely on prompting massive, general-purpose models like GPT-4. This work challenges that paradigm by demonstrating that a smaller, specialized LLM, when meticulously fine-tuned, can generate components that surpass expert-crafted heuristics within advanced solvers. We propose RFTHGS, a novel Reinforcement learning (RL) framework for Fine-Tuning a small LLM to generate high-performance crossover operators for the Hybrid Genetic Search (HGS) solver, applied to the Capacitated VRP (CVRP). Our method employs a multi-tiered, curriculum-based reward function that progressively guides the LLM to master generating first compilable, then executable, and finally, superior-performing operators that exceed human expert designs. This is coupled with an operator caching mechanism that discourages plagiarism and promotes diversity during training. Comprehensive experiments show that our fine-tuned LLM produces crossover operators which significantly outperform the expert-designed ones in HGS. The performance advantage remains consistent, generalizing from small-scale instances to large-scale problems with up to 1000 nodes. Furthermore, RFTHGS exceeds the performance of leading neuro-combinatorial baselines, prompt-based methods, and commercial LLMs such as GPT-4o and GPT-4o-mini.
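The tiered reward and operator cache described in the abstract can be illustrated with a minimal sketch. The abstract gives no reward values, cache keying, or interfaces, so everything here is an assumption for illustration only: the names `OperatorOutcome` and `TieredReward`, the numeric reward levels, the whitespace-normalized hash used as a cache key, and the reading of "plagiarism" as resubmitting an operator already seen during training. It is not the authors' implementation.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class OperatorOutcome:
    """Hypothetical result of evaluating one generated crossover operator."""
    compiled: bool                 # did the generated code compile?
    executed: bool                 # did it run inside the solver without crashing?
    cost: Optional[float] = None   # mean route cost achieved; lower is better


@dataclass
class TieredReward:
    """Illustrative tiered reward; all numeric levels are assumed, not from the paper."""
    baseline_cost: float                        # cost achieved by the expert-designed operator
    seen: Set[str] = field(default_factory=set)  # cache of operators already submitted

    def __call__(self, source: str, outcome: OperatorOutcome) -> float:
        # Cache key: hash of the source with whitespace normalized (an assumption).
        key = hashlib.sha256(" ".join(source.split()).encode()).hexdigest()
        if key in self.seen:
            return -1.0   # penalize resubmitting a cached operator
        self.seen.add(key)

        if not outcome.compiled:
            return 0.0    # tier 0: does not compile
        if not outcome.executed or outcome.cost is None:
            return 0.25   # tier 1: compiles but fails at run time
        if outcome.cost >= self.baseline_cost:
            return 0.5    # tier 2: runs, but does not beat the baseline
        # Tier 3: beats the baseline; add a bonus proportional to the relative gain.
        return 1.0 + (self.baseline_cost - outcome.cost) / self.baseline_cost


reward = TieredReward(baseline_cost=100.0)
print(reward("op_a", OperatorOutcome(compiled=True, executed=True, cost=90.0)))
print(reward("op_a", OperatorOutcome(compiled=True, executed=True, cost=90.0)))
```

In this sketch the second call scores the same source again and receives the duplicate penalty, which is one simple way a cache could push a policy toward diverse operators.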
Oct-14-2025
- Country:
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Asia > Singapore (0.04)
- Europe > Monaco (0.04)
- North America > United States (0.04)
- South America > Chile
- Genre:
- Research Report > Promising Solution (0.34)
- Industry:
- Transportation (0.56)