Collaborating Authors

 Chen, Xuewen


An Actor-Critic-Attention Mechanism for Deep Reinforcement Learning in Multi-view Environments

arXiv.org Machine Learning

In reinforcement learning algorithms, leveraging multiple views of the environment can improve the learning of complicated policies. In multi-view environments, because individual views may frequently suffer from partial observability, their levels of importance often differ. In this paper, we propose a deep reinforcement learning method with an attention mechanism for multi-view environments, in which each view can provide distinct representative information about the environment. Through the attention mechanism, our method generates a single feature representation of the environment from its multiple views and learns a policy that dynamically attends to each view according to its importance in the decision-making process. Through experiments, we show that our method outperforms state-of-the-art baselines on the TORCS racing car simulator and three other complex 3D environments with obstacles. We also provide experimental results evaluating the performance of our method under noisy conditions and partial-observation settings.
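
As an illustration of the view-fusion idea described above, the following sketch shows how per-view features might be weighted by a learned attention score and summed into a single state representation. The module structure, layer sizes, and tensor shapes are assumptions for illustration only, not the paper's exact architecture.

    # Hypothetical sketch of attention-based fusion of per-view features.
    # Shapes, sizes, and module names are illustrative assumptions.
    import torch
    import torch.nn as nn

    class ViewAttentionFusion(nn.Module):
        def __init__(self, feature_dim: int):
            super().__init__()
            # Scores each view's feature vector; softmax turns scores into weights.
            self.score = nn.Linear(feature_dim, 1)

        def forward(self, view_features: torch.Tensor) -> torch.Tensor:
            # view_features: (batch, num_views, feature_dim), one row per camera/view.
            scores = self.score(view_features)            # (batch, num_views, 1)
            weights = torch.softmax(scores, dim=1)        # importance of each view
            fused = (weights * view_features).sum(dim=1)  # (batch, feature_dim)
            return fused

    # Example: fuse features from 3 views extracted by a shared encoder.
    fusion = ViewAttentionFusion(feature_dim=256)
    features = torch.randn(8, 3, 256)   # batch of 8 samples, 3 views, 256-d features
    state = fusion(features)            # single representation per sample

Views that are currently uninformative (e.g. partially occluded) would receive small attention weights, so the fused representation is dominated by the views that matter for the current decision.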


Attention-based Deep Reinforcement Learning for Multi-view Environments

arXiv.org Machine Learning

In reinforcement learning algorithms, it is common practice to account for only a single view of the environment when making decisions; however, utilizing multiple views of the environment can help promote the learning of complicated policies. Since the views may frequently suffer from partial observability, the observations they provide can have different levels of importance. In this paper, we present a novel attention-based deep reinforcement learning method for multi-view environments in which each view can provide distinct representative information about the environment. Specifically, our method learns a policy that dynamically attends to views of the environment based on their importance in the decision-making process. We evaluate the performance of our method on the TORCS racing car simulator and three other complex 3D environments with obstacles.
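
To complete the picture, the sketch below shows how a fused multi-view representation (such as the one produced by the fusion module above) could feed actor and critic heads in an actor-critic agent. The layer widths, the continuous-action assumption, and the action dimension are illustrative guesses, not the authors' exact design.

    # Hypothetical actor-critic heads consuming the fused multi-view state.
    # Layer sizes, action dimension, and the use of continuous actions are assumptions.
    import torch
    import torch.nn as nn

    class ActorCritic(nn.Module):
        def __init__(self, feature_dim: int, action_dim: int):
            super().__init__()
            # Actor maps the fused state to bounded continuous actions (e.g. steering).
            self.actor = nn.Sequential(
                nn.Linear(feature_dim, 128), nn.ReLU(),
                nn.Linear(128, action_dim), nn.Tanh(),
            )
            # Critic estimates the state value of the same fused representation.
            self.critic = nn.Sequential(
                nn.Linear(feature_dim, 128), nn.ReLU(),
                nn.Linear(128, 1),
            )

        def forward(self, fused_state: torch.Tensor):
            return self.actor(fused_state), self.critic(fused_state)

    model = ActorCritic(feature_dim=256, action_dim=3)
    action, value = model(torch.randn(8, 256))  # fused_state from the fusion module above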


Optimizing Taxi Carpool Policies via Reinforcement Learning and Spatio-Temporal Mining

arXiv.org Artificial Intelligence

In this paper, we develop a reinforcement learning (RL) based system to learn an effective policy for carpooling that maximizes transportation efficiency, so that fewer cars are required to fulfill the given amount of trip demand. For this purpose, we first develop a deep neural network model, called ST-NN (Spatio-Temporal Neural Network), to predict taxi trip time from raw GPS trip data. Second, we develop a carpooling simulation environment for RL training, driven by the output of ST-NN and the NYC taxi trip dataset. In order to maximize transportation efficiency and minimize traffic congestion, we choose the effective distance covered by the driver on a carpool trip as the reward. Therefore, the more effective distance a driver covers over a trip (i.e., the more trip demand is satisfied), the higher the efficiency and the lower the traffic congestion. We compare the performance of the RL-learned policy to a fixed baseline policy (which always accepts carpooling) and obtain promising, interpretable results that demonstrate the advantage of our RL approach. We also compare the performance of ST-NN to that of state-of-the-art travel time estimation methods and observe that ST-NN significantly improves prediction performance and is more robust to outliers. In rapidly expanding metropolitan cities, taxis (which include cars working with ride-sharing platforms such as Uber, Lyft and DiDi) play a vital role in residents' daily commutes among all the available modes of transportation [1]. According to a survey in NYC [2], there is a stable demand for taxis of about 666,000 passengers per day, which is fulfilled by more than 13,000 taxis in the region. For these expanding cities, an emerging problem in meeting the increasing demand for taxis is to efficiently utilize the existing road networks to reduce potential traffic congestion and to optimize effective travel time and distance.
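
As a concrete illustration of the effective-distance reward idea, the small sketch below rewards a carpool trip by the distance that actually serves passenger demand and penalizes driving that serves no demand. The trip fields, the detour penalty, and its weight are hypothetical; the paper's exact reward definition may differ.

    # Hypothetical "effective distance" reward for a carpool trip.
    # Field names and the penalty term are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class CarpoolTrip:
        passenger_distances: list[float]  # distance each passenger is carried (km)
        driven_distance: float            # total distance the driver covers (km)

    def effective_distance_reward(trip: CarpoolTrip) -> float:
        # Sum of per-passenger distances served; two passengers sharing an
        # overlapping segment both count, so pooling yields more reward per
        # kilometre actually driven.
        effective = sum(trip.passenger_distances)
        # Penalize driving that serves no demand (assumed weight 0.1).
        detour = max(trip.driven_distance - effective, 0.0)
        return effective - 0.1 * detour

    # Two passengers carried 4.2 km and 3.8 km over a 5.0 km drive: reward 8.0.
    trip = CarpoolTrip(passenger_distances=[4.2, 3.8], driven_distance=5.0)
    print(effective_distance_reward(trip))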


A Unified Neural Network Approach for Estimating Travel Time and Distance for a Taxi Trip

arXiv.org Machine Learning

In building intelligent transportation systems such as taxi or rideshare services, accurate prediction of travel time and distance is crucial for customer experience and resource management. Using the NYC taxi dataset, which contains taxi trip data collected from GPS-enabled taxis [23], this paper investigates the use of deep neural networks to jointly predict taxi trip time and distance. We propose a model, called ST-NN (Spatio-Temporal Neural Network), which first predicts the travel distance between an origin and a destination GPS coordinate, and then combines this prediction with the time of day to predict the travel time. The advantage of ST-NN is that it uses only the raw trip data, without requiring further feature engineering, and provides a joint estimate of travel time and distance. We compare the performance of ST-NN to that of state-of-the-art travel time estimation methods and observe that the proposed approach generalizes better. We show that the ST-NN approach significantly reduces the mean absolute error for both predicted travel time and distance, by about 17% for travel time prediction. By testing ST-NN on the dataset with and without outliers, we also observe that the proposed approach is more robust to outliers.
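
The two-stage structure described above (coordinates to distance, then distance plus time of day to travel time) can be sketched as follows. The layer widths, input encodings, and module names are assumptions for illustration, not the published ST-NN architecture.

    # Hypothetical sketch of the two-stage ST-NN idea.
    # Layer widths and input encodings are illustrative assumptions.
    import torch
    import torch.nn as nn

    class STNNSketch(nn.Module):
        def __init__(self):
            super().__init__()
            # Stage 1: (origin lat/lon, destination lat/lon) -> distance estimate.
            self.distance_net = nn.Sequential(
                nn.Linear(4, 64), nn.ReLU(),
                nn.Linear(64, 1),
            )
            # Stage 2: (predicted distance, time-of-day feature) -> travel time.
            self.time_net = nn.Sequential(
                nn.Linear(2, 64), nn.ReLU(),
                nn.Linear(64, 1),
            )

        def forward(self, coords: torch.Tensor, time_of_day: torch.Tensor):
            # coords: (batch, 4) raw GPS coordinates; time_of_day: (batch, 1).
            distance = self.distance_net(coords)
            travel_time = self.time_net(torch.cat([distance, time_of_day], dim=1))
            return distance, travel_time

    model = STNNSketch()
    dist, travel_time = model(torch.randn(16, 4), torch.rand(16, 1))

Because the distance estimate is an input to the time predictor, both quantities are produced jointly from raw coordinates and a time-of-day signal, with no hand-engineered features.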


Learning Deep Networks from Noisy Labels with Dropout Regularization

arXiv.org Machine Learning

Large datasets often have unreliable labels, such as those obtained from Amazon's Mechanical Turk or social media platforms, and classifiers trained on mislabeled datasets often exhibit poor performance. We present a simple, effective technique for accounting for label noise when training deep neural networks. We augment a standard deep network with a softmax layer that models the label noise statistics. Then, we train the deep network and noise model jointly via end-to-end stochastic gradient descent on the (perhaps mislabeled) dataset. The augmented model is overdetermined, so in order to encourage the learning of a non-trivial noise model, we apply dropout regularization to the weights of the noise model during training. Numerical experiments on noisy versions of the CIFAR-10 and MNIST datasets show that the proposed dropout technique outperforms state-of-the-art methods.
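
The following sketch illustrates the general idea of augmenting a classifier with a label-noise layer and applying dropout to the noise model's weights. The module names, the identity initialization, and the dropout rate are assumptions for illustration; they are not taken from the paper.

    # Hypothetical sketch: a base classifier augmented with a label-noise layer,
    # trained end-to-end on possibly mislabeled data. Dropout on the noise layer's
    # weights discourages a trivial noise model. Details are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NoisyLabelModel(nn.Module):
        def __init__(self, base: nn.Module, num_classes: int, p_drop: float = 0.5):
            super().__init__()
            self.base = base
            # Rows of this matrix, after softmax, model p(noisy label | true label).
            self.noise = nn.Parameter(torch.eye(num_classes))
            self.p_drop = p_drop

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            clean_probs = F.softmax(self.base(x), dim=1)          # p(true label | x)
            noise_weights = F.dropout(self.noise, p=self.p_drop,
                                      training=self.training)     # regularize noise model
            transition = F.softmax(noise_weights, dim=1)          # row-stochastic matrix
            return clean_probs @ transition                       # p(noisy label | x)

    base = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))    # e.g. an MNIST-sized classifier
    model = NoisyLabelModel(base, num_classes=10)
    probs = model(torch.randn(32, 1, 28, 28))                     # probabilities over noisy labels
    # Train against the observed (possibly noisy) labels:
    loss = F.nll_loss(torch.log(probs + 1e-8), torch.randint(0, 10, (32,)))

At test time the noise layer is discarded and the base network's clean-label predictions are used directly.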