Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem
Holler, John, Vuorio, Risto, Qin, Zhiwei, Tang, Xiaocheng, Jiao, Yan, Jin, Tiancheng, Singh, Satinder, Wang, Chenxi, Ye, Jieping
arXiv.org Artificial Intelligence
Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace. Handcrafting heuristic solutions that account for the dynamics in these resource allocation problems is difficult, and the problems may be better handled by an end-to-end machine learning method. Previous works have applied machine learning methods to the problem from a high-level perspective, where the learning method is responsible for either repositioning the drivers or dispatching orders, and, as a further simplification, the drivers are considered independent agents maximizing their own reward functions. In this paper we present a deep reinforcement learning approach for tackling the full fleet management and dispatching problems. In addition to treating the drivers as individual agents, we consider the problem from a system-centric perspective, where a central fleet management agent is responsible for decision-making for all drivers.
INTRODUCTION
The order dispatching and fleet management system at a ride-sharing company must make decisions both for assigning available drivers to nearby passengers (hereafter called orders) and for repositioning drivers who have no nearby orders. These decisions have short-term effects on the revenue generated by the drivers and on driver availability. In the long term, they change the distribution of drivers across the city, which in turn has a critical impact on how well future orders can be served. Provident algorithmic solutions, which account for both the short-term and long-term consequences of their decisions, can improve the quality of service of ride-sharing platforms and are thus an important area of research.
Recent works [1], [2] have successfully applied Deep Reinforcement Learning (RL) techniques to dispatching problems, such as the Traveling Salesman Problem (TSP) and the more general Vehicle Routing Problem (VRP) [3]; however, they have primarily focused on static (i.e., those where all orders are known up front) and/or single-driver dispatching problems. In contrast to these problems, the fleet management and order dispatching problem that ride-sharing platforms face has multiple drivers and dynamically changing supply and demand conditions. We refer to this dynamic dispatching and fleet management problem as the Multi-Driver Vehicle Dispatching and Repositioning Problem (MDVDRP). VRPs and other problems similar to the MDVDRP are studied in the field of combinatorial optimization. Exactly solving instances of these problems at the scale of real-world environments is computationally intractable [4].
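To give a rough sense of why exact solutions do not scale, the following minimal sketch exhaustively solves a static, single-driver instance, the simplest case mentioned above. It is an illustration only and not the method of the paper: the function name `brute_force_route`, the open-route objective (the driver does not return to the depot), and the coordinates are all assumptions made for this example.

```python
from itertools import permutations
from math import dist, factorial


def brute_force_route(depot, orders):
    """Return (length, visiting order) of the shortest open route that starts
    at `depot` and visits every order location exactly once.

    Static, single-driver toy setting: all orders are known up front.
    """
    best_len, best_route = float("inf"), None
    for route in permutations(orders):  # n! candidate visiting orders
        length, pos = 0.0, depot
        for stop in route:
            length += dist(pos, stop)  # Euclidean distance between points
            pos = stop
        if length < best_len:
            best_len, best_route = length, route
    return best_len, best_route


# Hypothetical order locations on a line, with the driver starting at the origin.
orders = [(2, 0), (1, 0), (3, 0)]
length, route = brute_force_route((0, 0), orders)

# Number of visiting orders an exhaustive search must examine for n orders.
candidates = {n: factorial(n) for n in (5, 10, 20)}
```

With only 20 orders the exhaustive search already faces 20! (about 2.4 × 10^18) candidate routes, and that is before adding multiple drivers or orders that arrive over time, as in the MDVDRP.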
arXiv.org Artificial Intelligence
Nov-25-2019
- Country:
  - North America > United States
    - Michigan (0.04)
  - Asia
    - Middle East > Jordan (0.04)
    - China > Sichuan Province
      - Chengdu (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Transportation
    - Passenger (1.00)
    - Ground > Road (1.00)
    - Freight & Logistics Services (1.00)
- Technology: