This article develops a deep reinforcement learning (Deep-RL) framework for dynamic pricing on managed lanes with multiple access locations and heterogeneity in travelers' value of time, origin, and destination. This framework relaxes assumptions in the literature by considering multiple origins and destinations, multiple access locations to the managed lane, en route diversion of travelers, partial observability of the sensor readings, and stochastic demand and observations. The problem is formulated as a partially observable Markov decision process (POMDP), and policy gradient methods are used to determine tolls as a function of real-time observations. Tolls are modeled as continuous and stochastic variables and are determined using a feedforward neural network. The method is compared against a feedback control method used for dynamic pricing. We show that Deep-RL is effective in learning toll policies for maximizing revenue, minimizing total system travel time (TSTT), and other joint weighted objectives when tested on real-world transportation networks. The Deep-RL toll policies outperform the feedback control heuristic on both objectives: they generate revenues up to 9.5% higher than the heuristic under revenue maximization and TSTT up to 10.4% lower than the heuristic under TSTT minimization. We also propose reward shaping methods for the POMDP to overcome undesired behavior of toll policies, such as the jam-and-harvest behavior of revenue-maximizing policies. Additionally, we test the transferability of the algorithm trained on one set of inputs to new input distributions and offer recommendations on real-time implementations of Deep-RL algorithms. The source code for our experiments is available online at https://github.com/venktesh22/ExpressLanes_Deep-RL
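As a rough illustration of the continuous, stochastic toll policy described in the abstract above, the sketch below samples a toll from a Gaussian whose mean comes from a small feedforward network and computes the score term that a policy gradient method would scale by the observed return. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the single tanh hidden layer, the fixed standard deviation, and the toll bounds of 0 to 5 are all hypothetical.

```python
import math
import random


def feedforward_mean(obs, w_hidden, w_out):
    """Map an observation vector to a mean toll with one tanh hidden layer.

    The layer sizes and weights are hypothetical placeholders.
    """
    hidden = [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def sample_toll(mean, std, rng, toll_min=0.0, toll_max=5.0):
    """Draw a raw Gaussian action and clip it to assumed toll bounds."""
    raw = rng.gauss(mean, std)
    return raw, min(max(raw, toll_min), toll_max)


def gaussian_log_prob(raw, mean, std):
    """Log-density of the raw (unclipped) action under N(mean, std^2)."""
    return (
        -0.5 * ((raw - mean) / std) ** 2
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


def score_wrt_mean(raw, mean, std):
    """Derivative of the log-density with respect to the mean.

    A REINFORCE-style update multiplies this by the episode return.
    """
    return (raw - mean) / std ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    obs = [0.3, 0.7]                       # hypothetical sensor readings
    w_hidden = [[0.5, -0.2], [0.1, 0.4]]   # hypothetical weights
    w_out = [1.0, 2.0]
    mean = feedforward_mean(obs, w_hidden, w_out)
    raw, toll = sample_toll(mean, 0.5, rng)
    print(toll, gaussian_log_prob(raw, mean, 0.5), score_wrt_mean(raw, mean, 0.5))
```

The log-probability is computed on the raw sample rather than the clipped toll, which is one common simplification when actions are bounded.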
In this paper, we consider same-day delivery with a heterogeneous fleet of vehicles and drones. Customers make delivery requests over the course of the day, and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadlines. Vehicles can deliver multiple packages in one route but travel relatively slowly due to urban traffic. Drones travel faster, but they have limited capacity and require charging or battery swaps. To exploit the different strengths of the two fleets, we propose a deep Q-learning approach. Our method learns the value of assigning a new customer to either drones or vehicles, as well as the value of not offering service at all. To aid feature selection, we present an analytical study that demonstrates the role that different types of information play in the value function and in decision making. In a systematic computational study, we show that our policy outperforms benchmark policies and that our deep Q-learning approach is effective.
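The three-way decision in the abstract above (assign to a vehicle, assign to a drone, or decline service) can be sketched as a Q-learning action choice and target computation. This is a hedged sketch only: a plain dictionary stands in for the output of a deep Q-network, and the action names, the feasibility flags, and the discount factor are assumptions made for illustration.

```python
import random

ACTIONS = ("vehicle", "drone", "no_service")


def feasible_actions(feasible):
    """Declining service is always allowed; fleets need a feasibility flag."""
    return [a for a in ACTIONS if a == "no_service" or feasible.get(a, False)]


def choose_action(q_values, feasible, epsilon, rng):
    """Epsilon-greedy choice over the feasible actions for a new customer."""
    candidates = feasible_actions(feasible)
    if rng.random() < epsilon:
        return rng.choice(candidates)  # explore
    return max(candidates, key=lambda a: q_values[a])  # exploit


def q_target(reward, next_q_values, next_feasible, gamma=0.99, terminal=False):
    """One-step Q-learning target: reward plus discounted best next value."""
    if terminal:
        return reward
    best_next = max(next_q_values[a] for a in feasible_actions(next_feasible))
    return reward + gamma * best_next


if __name__ == "__main__":
    rng = random.Random(1)
    q = {"vehicle": 1.0, "drone": 3.0, "no_service": 0.0}  # hypothetical values
    print(choose_action(q, {"vehicle": True, "drone": False}, 0.1, rng))
```

In a full deep Q-learning setup the dictionary would be replaced by a network evaluated on state features, and the target would be used in a regression loss.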
Hot on the heels of Google's robot cars, a team of German researchers at AutoNOMOS Labs in Berlin's Free University have upped the ante and unveiled the driverless taxi. Imagine never arguing about the most efficient route or mentally debating the merits of tipping a driver whose ineptitude at the wheel almost killed you. Made in Germany (MIG) is an autonomous Volkswagen Passat cab you hail using an iPad app, and it eliminates the most unappealing aspect of taxis: the driver. MIG is equipped with GPS navigation, video cameras, laser scanners, sensors and radars that it uses to construct a 3D map of its surroundings. It uses this map to detect pedestrians and other vehicles as it navigates the road.
Short-term demand forecasting models commonly combine convolutional and recurrent layers to extract complex spatiotemporal patterns in data. Long-term histories are also used as time series inputs to capture periodicity and seasonality. In this study, we propose an efficient architecture, Temporal-Guided Network (TGNet), which utilizes graph networks and temporal-guided embedding. Instead of convolutional layers, graph networks extract features that are invariant to permutations of adjacent regions. Temporal-guided embedding explicitly learns temporal contexts from training data and is substituted for the input of long-term histories from days or weeks ago. TGNet learns an autoregressive model conditioned on the temporal contexts of forecasting targets provided by the temporal-guided embedding. Our model achieves performance competitive with other baselines on three real-world spatiotemporal demand datasets, while its number of trainable parameters is about 20 times smaller than that of a state-of-the-art baseline. We also show that temporal-guided embedding learns temporal contexts as intended and that TGNet's forecasting performance is robust even in atypical event situations.
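To make the permutation-invariance claim in the abstract above concrete, the following sketch pools the feature vectors of adjacent regions with a mean, so the result does not depend on the order in which neighbors are listed. It is an illustrative assumption rather than TGNet's actual layer: the function name, the mean-pooling choice, and the concatenation with the center region's features are all hypothetical.

```python
def aggregate_neighbors(center, neighbors):
    """Concatenate a region's features with the mean of its neighbors' features.

    Mean pooling is symmetric in its inputs, so reordering `neighbors`
    leaves the output unchanged.
    """
    dim = len(center)
    if not neighbors:
        pooled = [0.0] * dim  # isolated region: no neighbor signal
    else:
        pooled = [sum(n[i] for n in neighbors) / len(neighbors) for i in range(dim)]
    return list(center) + pooled


if __name__ == "__main__":
    center = [0.0, 0.0]
    neighbors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical demand features
    print(aggregate_neighbors(center, neighbors))
    print(aggregate_neighbors(center, list(reversed(neighbors))))
```

A convolution over a fixed grid, by contrast, applies a different weight to each relative position, so its output changes when neighboring cells are swapped.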
Emerging transportation modes, including car-sharing, bike-sharing, and ride-hailing, are transforming urban mobility but have been shown to reinforce socioeconomic inequities. Spatiotemporal demand prediction models for these new mobility regimes must therefore consider fairness as a first-class design requirement. We present FairST, a fairness-aware model for predicting demand for new mobility systems. Our approach utilizes 1D, 2D, and 3D convolutions to integrate various urban features and learn the spatial-temporal dynamics of a mobility system, and it includes fairness metrics as a form of regularization to make the predictions more equitable across demographic groups. We propose two novel spatiotemporal fairness metrics, a region-based fairness gap (RFG) and an individual-based fairness gap (IFG). Both quantify equity in a spatiotemporal context, but they differ in whether demographics are labeled at the region level (RFG) or whether population distribution information is available (IFG). Experimental results on real bike share and ride share datasets demonstrate the effectiveness of the proposed model: FairST not only reduces the fairness gap by more than 80%, but can surprisingly achieve better accuracy than state-of-the-art yet fairness-oblivious methods including LSTMs, ConvLSTMs, and 3D CNNs.
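The idea of using a fairness metric as a regularizer, as described in the abstract above, can be sketched as a prediction loss plus a weighted gap term. The abstract does not define RFG or IFG, so the gap below (the absolute difference in mean predicted demand between two region groups) is only a stand-in assumption, and the names `group_gap`, `fair_loss`, and the weight `lam` are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between predicted and observed demand."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def group_gap(pred, group):
    """Absolute difference in mean predicted demand between groups 0 and 1.

    This is an assumed placeholder, not the RFG or IFG definition.
    """
    g0 = [p for p, g in zip(pred, group) if g == 0]
    g1 = [p for p, g in zip(pred, group) if g == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))


def fair_loss(pred, target, group, lam):
    """Accuracy term plus a fairness penalty weighted by lam."""
    return mse(pred, target) + lam * group_gap(pred, group)


if __name__ == "__main__":
    pred = [1.0, 2.0, 3.0, 4.0]    # hypothetical per-region predictions
    target = [1.0, 2.0, 3.0, 4.0]
    group = [0, 0, 1, 1]           # hypothetical region-level group labels
    print(fair_loss(pred, target, group, lam=0.5))
```

Setting `lam` to zero recovers the plain accuracy objective, while larger values trade accuracy for a smaller gap between groups.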