Tsinghua University, Institute for Interdisciplinary Information Sciences
When Will You Arrive? Estimating Travel Time Based on Deep Neural Networks
Wang, Dong (Duke University) | Zhang, Junbo (Microsoft Research) | Cao, Wei (Tsinghua University, Institute for Interdisciplinary Information Sciences) | Li, Jian (Tsinghua University, Institute for Interdisciplinary Information Sciences) | Zheng, Yu (Microsoft Research)
Estimating the travel time of any path (denoted by a sequence of connected road segments) in a city is of great importance to traffic monitoring, route planning, ridesharing, taxi/Uber dispatching, etc. However, it is a very challenging problem, affected by diverse complex factors, including spatial correlations, temporal dependencies, and external conditions (e.g., weather and traffic lights). Prior work usually estimates the travel times of individual road segments or sub-paths and then sums them, which leads to inaccurate estimates because such approaches do not consider road intersections or traffic lights, and local errors may accumulate. To address these issues, we propose an end-to-end Deep learning framework for Travel Time Estimation, called DeepTTE, that estimates the travel time of the whole path directly. More specifically, we present a geo-convolution operation that integrates geographic information into the classical convolution and is capable of capturing spatial correlations. By stacking recurrent units on the geo-convolution layer, DeepTTE can capture the temporal dependencies as well. A multi-task learning component on top of DeepTTE estimates the travel time of both the entire path and each local path simultaneously during the training phase. Extensive experiments on two large-scale datasets show that DeepTTE significantly outperforms state-of-the-art methods.
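The multi-task idea in this abstract (learning the entire-path travel time and the local-path travel times together during training) can be illustrated with a small loss function. This is only a sketch under stated assumptions, not the paper's implementation: the abstract does not give the loss, so the use of mean absolute percentage error, the weight `beta`, and the function names `mape` and `multitask_loss` are all hypothetical choices made for illustration.

```python
def mape(pred, truth):
    """Absolute percentage error for one positive ground-truth travel time."""
    return abs(pred - truth) / truth


def multitask_loss(local_preds, local_truths, path_pred, path_truth, beta=0.3):
    """Weighted sum of a local-path loss and an entire-path loss.

    beta is a hypothetical mixing weight (an assumption, not from the abstract):
    beta = 0 trains on the whole path only, beta = 1 on local paths only.
    """
    # Average error over the local paths of one trajectory.
    local_loss = sum(
        mape(p, t) for p, t in zip(local_preds, local_truths)
    ) / len(local_preds)
    # Error of the single estimate for the whole path.
    entire_loss = mape(path_pred, path_truth)
    return beta * local_loss + (1.0 - beta) * entire_loss
```

For example, `multitask_loss([55.0, 70.0], [60.0, 65.0], 130.0, 125.0)` combines the errors of two local paths with the error of the whole-path estimate. At inference time only the whole-path estimate would be needed, which matches the abstract's statement that the local estimates are used during the training phase.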
An Optimal Online Method of Selecting Source Policies for Reinforcement Learning
Li, Siyuan (Tsinghua University, Institute for Interdisciplinary Information Sciences) | Zhang, Chongjie (Tsinghua University, Institute for Interdisciplinary Information Sciences)
Transfer learning can significantly accelerate reinforcement learning by exploiting relevant knowledge from previous experience. Optimally selecting source policies during the learning process is important yet challenging, and there has been little theoretical analysis of this problem. In this paper, we develop an optimal online method to select source policies for reinforcement learning. This method formulates online source policy selection as a multi-armed bandit problem and augments Q-learning with policy reuse. We provide theoretical guarantees of the optimal selection process and convergence to the optimal policy. In addition, we conduct experiments on a grid-based robot navigation domain to demonstrate the method's efficiency and robustness in comparison with the state-of-the-art transfer learning method.
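The combination described in this abstract (a multi-armed bandit that picks a source policy, plus Q-learning that reuses the picked policy) can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the authors' algorithm: the bandit rule (UCB1), the probabilistic reuse scheme with `reuse_prob`, the one-dimensional chain environment, and every function name and hyperparameter are hypothetical choices, since the abstract does not specify them.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """UCB1: try every arm once, then maximize mean + exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]),
    )


def run_episode(q, guide, n_states, rng, alpha=0.5, gamma=0.95,
                reuse_prob=0.8, eps=0.1, max_steps=50):
    """One episode on a toy chain (assumed domain). Actions: 0 = left, 1 = right.

    The agent starts at state 0; reaching the rightmost state gives reward 1.
    Returns the discounted return, which lies in [0, 1].
    """
    s, ret = 0, 0.0
    for step in range(max_steps):
        if guide is not None and rng.random() < reuse_prob:
            a = guide(s)                      # reuse the selected source policy
        elif rng.random() < eps:
            a = rng.randrange(2)              # random exploration
        else:
            a = 0 if q[s][0] > q[s][1] else 1  # greedy on Q, ties go right
        s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
        done = s2 == n_states - 1
        r = 1.0 if done else 0.0
        # Standard Q-learning update, applied whichever rule chose the action.
        target = r + (0.0 if done else gamma * max(q[s2]))
        q[s][a] += alpha * (target - q[s][a])
        ret += (gamma ** step) * r
        s = s2
        if done:
            break
    return ret


def train(source_policies, n_states=8, episodes=200, seed=0):
    """Each episode, the bandit picks one source policy (or none) to reuse."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    arms = list(source_policies) + [None]     # None = plain Q-learning
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, episodes + 1):
        arm = ucb1_select(counts, means, t)
        ret = run_episode(q, arms[arm], n_states, rng)
        counts[arm] += 1
        means[arm] += (ret - means[arm]) / counts[arm]  # running mean return
    return q, counts
```

Calling `train([lambda s: 1, lambda s: 0])` offers one helpful source policy (always move right) and one unhelpful one (always move left). In this sketch the bandit treats each episode's discounted return as the arm's reward, so an unhelpful source policy should be selected less and less often as evidence accumulates.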