DAN: Decentralized Attention-based Neural Network to Solve the MinMax Multiple Traveling Salesman Problem

Cao, Yuhong, Sun, Zhanhong, Sartoretti, Guillaume

arXiv.org Artificial Intelligence 

The traveling salesman problem (TSP) is a challenging Instead of solving mTSP as a combinatorial optimization, NP-hard problem, where given a group of cities (i.e., nodes) we focus on solving it as a decentralized cooperation problem, of a given graph (often complete), an agent needs to find where agents each construct their own tour towards a complete tour of this graph, i.e., a closed path from a a common objective. To this end, we rely on a threefold given starting node that visits all other nodes exactly once approach: first, we formulate mTSP as a sequential decision with minimal path length. TSP can be further extended to making problem and introduce a decision time gap that allows multiple traveling salesman problem (mTSP), where multiple agents to make decisions asynchronously for enhanced agents collaborate with each other to visit all cities from a collaboration. Second, we propose an attention based neural common starting node. Compared to TSP, mTSP has more network to allow agents to make individual decisions according general real world applications such as last-mile delivery, to their own observations, which provides agents with the UAV patrolling and transportation planning [1]. As classical ability to implicitly predict other agents' future decisions, combinatorial optimization problems, TSP and mTSP are by modeling the dependencies of all the agents and cities. commonly solved using exact or heuristic algorithms. Exact Third, we train our model using multi-agent reinforcement algorithms can theoretically guarantee optimal solutions [1], learning with parameter sharing, which provides our model [2], but rely on centralized, exhaustive planning, and thus do with natural scalability with the number of agents. We note not scale well with the number of agents and cities. On the that these tools are more general than mTSP, and could other hand, heuristic algorithms [1], [3] only find suboptimal extend to other robotic problems that need to address agent solutions but are significantly faster than exact algorithms.