This article develops a deep reinforcement learning (Deep-RL) framework for dynamic pricing on managed lanes with multiple access locations and heterogeneity in travelers' value of time, origin, and destination. This framework relaxes assumptions in the literature by considering multiple origins and destinations, multiple access locations to the managed lane, en route diversion of travelers, partial observability of the sensor readings, and stochastic demand and observations. The problem is formulated as a partially observable Markov decision process (POMDP) and policy gradient methods are used to determine tolls as a function of real-time observations. Tolls are modeled as continuous and stochastic variables, and are determined using a feedforward neural network. The method is compared against a feedback control method used for dynamic pricing. We show that Deep-RL is effective in learning toll policies for maximizing revenue, minimizing total system travel time, and other joint weighted objectives, when tested on real-world transportation networks. The Deep-RL toll policies outperform the feedback control heuristic for the revenue maximization objective by generating revenues up to 9.5% higher than the heuristic and for the objective minimizing total system travel time (TSTT) by generating TSTT up to 10.4% lower than the heuristic. We also propose reward shaping methods for the POMDP to overcome the undesired behavior of toll policies, like the jam-and-harvest behavior of revenue-maximizing policies. Additionally, we test transferability of the algorithm trained on one set of inputs for new input distributions and offer recommendations on real-time implementations of Deep-RL algorithms. The source code for our experiments is available online at https://github.com/venktesh22/ExpressLanes_Deep-RL
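As a rough illustration of what "continuous and stochastic" tolls produced by a feedforward network can look like, the following minimal sketch samples tolls from a Gaussian whose mean is the output of a one-hidden-layer network. It is not the authors' implementation (their code is at the repository linked above). The layer sizes, the toll bounds `toll_min` and `toll_max`, and the function names are all assumptions made for this example.

```python
import numpy as np

def init_policy(n_obs, n_hidden, n_tolls, rng):
    """Randomly initialise a one-hidden-layer feedforward policy (sizes are hypothetical)."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_hidden, n_obs)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_tolls, n_hidden)),
        "b2": np.zeros(n_tolls),
        "log_std": np.zeros(n_tolls),  # learnable spread of the stochastic tolls
    }

def sample_tolls(params, obs, rng, toll_min=0.1, toll_max=4.0):
    """Sample one toll per access location from a Gaussian policy.

    obs is the vector of sensor readings (the partial observation).
    toll_min and toll_max are assumed bounds, not values from the article.
    Returns the clipped tolls and the log-probability of the unclipped sample,
    which a policy gradient method would weight by the observed return.
    """
    hidden = np.tanh(params["W1"] @ obs + params["b1"])
    mean = params["W2"] @ hidden + params["b2"]
    std = np.exp(params["log_std"])
    raw = rng.normal(mean, std)
    log_prob = -0.5 * np.sum(
        ((raw - mean) / std) ** 2 + 2.0 * params["log_std"] + np.log(2.0 * np.pi)
    )
    return np.clip(raw, toll_min, toll_max), log_prob

rng = np.random.default_rng(0)
params = init_policy(n_obs=5, n_hidden=8, n_tolls=3, rng=rng)
tolls, log_prob = sample_tolls(params, np.ones(5), rng)
```

Only the sampling step is shown here. Training would additionally require a traffic simulator to supply the reward (for example revenue or negative TSTT) and a gradient step on the policy parameters.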
WASHINGTON, DC (March 8, 2017)--Interventional radiologists at the University of California at Los Angeles (UCLA) are using technology found in self-driving cars to power a machine learning application that helps guide patients' interventional radiology care, according to research presented today at the Society of Interventional Radiology's 2017 Annual Scientific Meeting. The researchers used cutting-edge artificial intelligence to create a "chatbot" interventional radiologist that can automatically communicate with referring clinicians and quickly provide evidence-based answers to frequently asked questions. This allows the referring physician to provide real-time information to the patient about the next phase of treatment, or basic information about an interventional radiology treatment. "We theorized that artificial intelligence could be used in a low-cost, automated way in interventional radiology as a way to improve patient care," said Edward W. Lee, M.D., Ph.D., assistant professor of radiology at UCLA's David Geffen School of Medicine and one of the authors of the study. "Because artificial intelligence has already begun transforming many industries, it has great potential to also transform health care."
Emerging transportation modes, including car-sharing, bike-sharing, and ride-hailing, are transforming urban mobility but have been shown to reinforce socioeconomic inequities. Spatiotemporal demand prediction models for these new mobility regimes must therefore consider fairness as a first-class design requirement. We present FairST, a fairness-aware model for predicting demand for new mobility systems. Our approach uses 1D, 2D, and 3D convolutions to integrate various urban features and learn the spatiotemporal dynamics of a mobility system, and it includes fairness metrics as a form of regularization to make the predictions more equitable across demographic groups. We propose two novel spatiotemporal fairness metrics, a region-based fairness gap (RFG) and an individual-based fairness gap (IFG). Both quantify equity in a spatiotemporal context, but they differ in whether demographics are labeled at the region level (RFG) or population distribution information is available (IFG). Experimental results on real bike share and ride share datasets demonstrate the effectiveness of the proposed model: FairST not only reduces the fairness gap by more than 80%, but can surprisingly achieve better accuracy than state-of-the-art yet fairness-oblivious methods, including LSTMs, ConvLSTMs, and 3D CNNs.
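The idea of using a fairness metric as a regularizer can be sketched as a prediction loss plus a weighted gap term. The gap below is a hypothetical stand-in, not the paper's definition of RFG or IFG: it assumes each region carries a binary group label and compares mean per-capita predicted demand between the two groups. The weight `lam` and all function names are likewise assumptions.

```python
import numpy as np

def region_gap(pred, population, group):
    """Absolute difference in mean per-capita predicted demand between
    regions labeled 1 and regions labeled 0 (an illustrative gap only)."""
    per_capita = pred / population
    return abs(per_capita[group == 1].mean() - per_capita[group == 0].mean())

def fair_loss(pred, target, population, group, lam=1.0):
    """Mean squared error plus lam times the illustrative fairness gap."""
    mse = np.mean((pred - target) ** 2)
    return mse + lam * region_gap(pred, population, group)

# Four regions with equal population; the last two belong to group 1.
pred = np.array([10.0, 20.0, 30.0, 40.0])
population = np.array([100.0, 100.0, 100.0, 100.0])
group = np.array([0, 0, 1, 1])
loss = fair_loss(pred, pred, population, group, lam=2.0)
```

Raising `lam` trades prediction accuracy for a smaller gap. With perfect predictions, as in the usage lines, the loss consists of the gap term alone.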
Vision-based navigation of modern autonomous vehicles primarily depends on Deep Neural Network (DNN) based systems in which the controller obtains input from sensors/detectors such as cameras, and produces an output such as a steering wheel angle to navigate the vehicle safely in roadway traffic. Typically, these DNN-based systems are trained through supervised and/or transfer learning; however, recent studies show that these systems can be compromised by perturbation or adversarial input features on the trained DNN-based models. Similarly, this perturbation can be introduced into the autonomous vehicle DNN-based system by roadway hazards such as debris and roadblocks. In this study, we first introduce a hazardous roadway environment (both intentional and unintentional) that can compromise the DNN-based system of an autonomous vehicle, producing an incorrect vehicle navigational output such as a steering wheel angle, which can cause crashes resulting in fatalities and injuries. Then, we develop an approach based on object detection and semantic segmentation to mitigate the adverse effect of this hazardous environment, one that helps the autonomous vehicle to navigate safely around such hazards. This study finds that the DNN-based model with hazardous object detection and semantic segmentation improves the ability of an autonomous vehicle to avoid potential crashes by 21% compared to the traditional DNN-based autonomous driving system.
This study proposes a framework for human-like autonomous car-following planning based on deep reinforcement learning (deep RL). Historical driving data are fed into a simulation environment where an RL agent learns from trial-and-error interactions based on a reward function that signals how much the agent deviates from the empirical data. Through these interactions, an optimal policy, or car-following model, is obtained that maps in a human-like way from speed, relative speed between a lead and following vehicle, and inter-vehicle spacing to the acceleration of the following vehicle. The model can be continuously updated as more data are fed in. Two thousand car-following periods extracted from the 2015 Shanghai Naturalistic Driving Study were used to train the model and compare its performance with that of traditional and recent data-driven car-following models. The results show that a deep deterministic policy gradient car-following model that uses the disparity between simulated and observed speed as the reward function and considers a reaction delay of 1 s, denoted as DDPGvRT, can reproduce human-like car-following behavior with higher accuracy than traditional and recent data-driven car-following models. Specifically, the DDPGvRT model has a spacing validation error of 18% and a speed validation error of 5%, which are lower than those of other models, including the intelligent driver model, models based on locally weighted regression, and conventional neural network-based models. Moreover, the DDPGvRT model generalizes well to various driving situations and can adapt to different drivers by continuously learning. This study demonstrates that reinforcement learning methodology can offer insight into driver behavior and can contribute to the development of human-like autonomous driving algorithms and traffic-flow models.
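A reward based on speed disparity, combined with a reaction delay, might be sketched as follows. This is an illustration under stated assumptions rather than the study's exact formulation: the relative-error form of the reward, the 0.1 s simulation step (so that 10 steps give the 1 s delay), and the names `speed_reward` and `DelayedObservation` are all hypothetical.

```python
from collections import deque

def speed_reward(v_sim, v_obs, eps=1e-6):
    """Negative relative disparity between simulated and observed speed.

    The reward is 0 when the simulated follower matches the recorded driver
    and becomes more negative as the two speeds diverge.
    """
    return -abs(v_sim - v_obs) / max(abs(v_obs), eps)

class DelayedObservation:
    """Feeds the agent the state from delay_steps simulation steps ago."""

    def __init__(self, delay_steps, initial_state):
        # Pre-fill so the agent sees the initial state until the buffer warms up.
        self.buffer = deque([initial_state] * (delay_steps + 1),
                            maxlen=delay_steps + 1)

    def push(self, state):
        """Record the newest state and return the delayed one."""
        self.buffer.append(state)
        return self.buffer[0]

# Assumed 0.1 s simulation step, so 10 steps correspond to a 1 s reaction delay.
delay = DelayedObservation(delay_steps=10, initial_state=(15.0, 0.0, 20.0))
state_seen_by_agent = delay.push((15.2, -0.1, 19.8))  # (speed, relative speed, spacing)
reward = speed_reward(v_sim=9.0, v_obs=10.0)
```

Each state tuple holds the three inputs named in the abstract (speed, relative speed, spacing). The numeric values are made up for the usage lines.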