A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets

Shi, Chengchun; Wan, Runzhe; Song, Ge; Luo, Shikai; Song, Rui; Zhu, Hongtu

arXiv.org Artificial Intelligence 

This paper concerns applications in two-sided markets that involve a group of subjects making sequential decisions across time and/or location. In particular, we consider large-scale fleet management in ride-sharing companies such as Uber, Lyft and Didi. These companies form a typical two-sided market that enables efficient interactions between passengers and drivers (Armstrong, 2006; Rysman, 2009). With the rapid development of smartphones and the Internet of Things, they have substantially transformed the transportation landscape (Frenken and Schor, 2017; Jin et al., 2018; Hagiu and Wright, 2019). Drawing on rich information about passenger demand and the locations of taxi drivers, they significantly reduce taxi cruise time and passenger waiting time compared with traditional taxi systems (Li et al., 2011; Zhang et al., 2014; Miao et al., 2016). We use the numbers of drivers and call orders to measure supply and demand, respectively, at a given time and location. Both supply and demand are spatio-temporal processes, and they interact with each other. These processes depend strongly on the platform's policies and have a substantial impact on the platform's outcomes of interest, such as drivers' income level and working time, passengers' satisfaction rate, order answering rate and order finishing rate.
