best-known solution
Anytime Multi-Agent Path Finding using Operation Parallelism in Large Neighborhood Search
Chan, Shao-Hung, Chen, Zhe, Lin, Dian-Lun, Zhang, Yue, Harabor, Daniel, Huang, Tsung-Wei, Koenig, Sven, Phan, Thomy
Multi-Agent Path Finding (MAPF) is the problem of finding a set of collision-free paths for multiple agents in a shared environment while minimizing the sum of travel time. Since solving the MAPF problem optimally is NP-hard, anytime algorithms based on Large Neighborhood Search (LNS) are promising to find good-quality solutions in a scalable way by iteratively destroying and repairing the paths. We propose Destroy-Repair Operation Parallelism for LNS (DROP-LNS), a parallel framework that performs multiple destroy and repair operations concurrently to explore more regions of the search space within a limited time budget. Unlike classic MAPF approaches, DROP-LNS can exploit parallelized hardware to improve the solution quality. We also formulate two variants of parallelism and conduct experimental evaluations. The results show that DROP-LNS significantly outperforms the state-of-the-art and the variants.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction
Yang, Kaiyuan, Huang, Houjing, Vandans, Olafs, Murali, Adithya, Tian, Fujia, Yap, Roland H. C., Dai, Liang
A central problem in computational biophysics is protein structure prediction, i.e., finding the optimal folding of a given amino acid sequence. This problem has been studied in a classical abstract model, the HP model, where the protein is modeled as a sequence of H (hydrophobic) and P (polar) amino acids on a lattice. The objective is to find conformations maximizing H-H contacts. It is known that even in this reduced setting, the problem is intractable (NP-hard). In this work, we apply deep reinforcement learning (DRL) to the two-dimensional HP model. We can obtain the conformations of best known energies for benchmark HP sequences with lengths from 20 to 50. Our DRL is based on a deep Q-network (DQN). We find that a DQN based on long short-term memory (LSTM) architecture greatly enhances the RL learning ability and significantly improves the search process. DRL can sample the state space efficiently, without the need of manual heuristics. Experimentally we show that it can find multiple distinct best-known solutions per trial. This study demonstrates the effectiveness of deep reinforcement learning in the HP model for protein folding.
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Singapore (0.04)
- (4 more...)
Learned upper bounds for the Time-Dependent Travelling Salesman Problem
Adamo, Tommaso, Ghiani, Gianpaolo, Greco, Pierpaolo, Guerriero, Emanuela
Given a graph whose arc traversal times vary over time, the Time-Dependent Travelling Salesman Problem consists in finding a Hamiltonian tour of least total duration covering the vertices of the graph. The main goal of this work is to define tight upper bounds for this problem by reusing the information gained when solving instances with similar features. This is customary in distribution management, where vehicle routes have to be generated over and over again with similar input data. To this aim, we devise an upper bounding technique based on the solution of a classical (and simpler) time-independent Asymmetric Travelling Salesman Problem, where the constant arc costs are suitably defined by the combined use of a Linear Program and a mix of unsupervised and supervised Machine Learning techniques. The effectiveness of this approach has been assessed through a computational campaign on the real travel time functions of two European cities: Paris and London. The overall average gap between our heuristic and the best-known solutions is about 0.001\%. For 31 instances, new best solutions have been obtained.
- North America > United States > California > Alameda County > Oakland (0.04)
- Europe > Italy (0.04)
Fast Approximate Solutions using Reinforcement Learning for Dynamic Capacitated Vehicle Routing with Time Windows
Sultana, Nazneen N, Baniwal, Vinita, Basumatary, Ansuma, Mittal, Piyush, Ghosh, Supratim, Khadilkar, Harshad
This paper develops an inherently parallelised, fast, approximate learning-based solution to the generic class of Capacitated Vehicle Routing with Time Windows and Dynamic Routing (CVRP-TWDR). Considering vehicles in a fleet as decentralised agents, we postulate that using reinforcement learning (RL) based adaptation is a key enabler for real-time route formation in a dynamic environment. The methodology allows each agent (vehicle) to independently evaluate the value of serving each customer, and uses a centralised allocation heuristic to finalise the allocations based on the generated values. We show that the solutions produced by this method on standard datasets are significantly faster than exact formulations and state-of-the-art meta-heuristics, while being reasonably close to optimal in terms of solution quality. We describe experiments in both the static case (when all customer demands and time windows are known in advance) as well as the dynamic case (where customers can `pop up' at any time during execution). The results with a single trained model on large, out-of-distribution test data demonstrate the scalability and flexibility of the proposed approach.
Hybrid 2-stage Imperialist Competitive Algorithm with Ant Colony Optimization for Solving Multi-Depot Vehicle Routing Problem
Dzalbs, Ivars, Kalganova, Tatiana
The Multi-Depot Vehicle Routing Problem (MDVRP) is a real-world model of the simplistic Vehicle Routing Problem (VRP) that considers how to satisfy multiple customer demands from numerous depots. This paper introduces a hybrid 2-stage approach based on two population-based algorithms - Ant Colony Optimization (ACO) that mimics ant behaviour in nature and the Imperialist Competitive Algorithm (ICA) that is based on geopolitical relationships between countries. In the proposed hybrid algorithm, ICA is responsible for customer assignment to the depots while ACO is routing and sequencing the customers. The algorithm is compared to non-hybrid ACO and ICA as well as four other state-of-the-art methods across 23 common Cordreau's benchmark instances. Results show clear improvement over simple ACO and ICA and demonstrate very competitive results when compared to other rival algorithms.
- Europe > United Kingdom (0.04)
- Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
Exploiting Promising Sub-Sequences of Jobs to solve the No-Wait Flowshop Scheduling Problem
Mousin, Lucien, Kessaci, Marie-Eléonore, Dhaenens, Clarisse
The no-wait flowshop scheduling problem is a variant of the classical permutation flowshop problem, with the additional constraint that jobs have to be processed by the successive machines without waiting time. To efficiently address this NP-hard combinatorial optimization problem we conduct an analysis of the structure of good quality solutions. This analysis shows that the No-Wait specificity gives them a common structure: they share identical sub-sequences of jobs, we call super-jobs. After a discussion on the way to identify these super-jobs, we propose IG-SJ, an algorithm that exploits super-jobs within the state-of-the-art algorithm for the classical permutation flowshop, the well-known Iterated Greedy (IG) algorithm. An iterative approach of IG-SJ is also proposed. Experiments are conducted on Taillard's instances. The experimental results show that exploiting super-jobs is successful since IG-SJ is able to find 64 new best solutions.