Modeling Severe Traffic Accidents With Spatial And Temporal Features

Machine Learning

We present an approach to estimating the severity of traffic-related accidents in aggregated (area-level) and disaggregated (point-level) data. To explore spatial features, we measure the complexity of road networks using several area-level variables. We combine these with temporal and other situational features from open data for New York City, using Gradient Boosting models for inference and for measuring feature importance, along with Gaussian Processes to model spatial dependencies in the data. The results show that complexity is significantly important in the aggregated model and that other features also contribute to prediction, which may be helpful in framing policies and targeting interventions to prevent severe traffic-related accidents and injuries.

Learning Deep Representation from Big and Heterogeneous Data for Traffic Accident Inference

AAAI Conferences

With the rapid development of urbanization and public transportation systems, the number of traffic accidents has increased significantly worldwide over the past decades and has become a major problem for human society. Given these possible and unexpected traffic accidents, understanding what causes them and raising early alarms for possible ones will play a critical role in planning effective traffic management. However, due to the lack of supporting sensing data, research on updating traffic accident risk in real time is very limited. Therefore, in this paper, we collect big and heterogeneous data (7 months of traffic accident data and 1.6 million users' GPS records) to understand how human mobility affects traffic accident risk. By mining these data, we develop a deep Stacked Denoising Autoencoder model to learn a hierarchical feature representation of human mobility. These features are then used for efficient prediction of the traffic accident risk level. Once trained, our model can simulate the corresponding traffic accident risk map given real-time input of human mobility. The experimental results demonstrate the efficiency of our model and suggest that traffic accident risk can be significantly more predictable through human mobility.

Risk-Aware Planning: Methods and Case Study for Safer Driving Routes

AAAI Conferences

Vehicle crashes account for over one million fatalities and many millions more injuries annually worldwide. Some roads are safer than others, so a driving route optimized for safety may reduce the number of crashes. We have developed a method to estimate the probability of a crash on any road as a function of the traffic volume, road characteristics, and environmental conditions. We trained a regression model to estimate traffic volume and a binary classifier to estimate crash probability on road segments. Modeling a route’s crash probability as a series of Bernoulli probability trials, we show how to use a simple Dijkstra algorithm to compute the safest route between two locations. Compared to the fastest route, the safest route takes about 1.7 times as long on average and is about half as dangerous. We also show how to smoothly trade off safety for time, giving several different route options with different crash probabilities and durations.
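The routing idea in this abstract can be sketched as follows, assuming (as a simplification the abstract does not spell out) that crashes on different segments are independent. Under that assumption, the probability of completing a route without a crash is the product of (1 - p_i) over its segments, so running Dijkstra's algorithm with edge cost -log(1 - p_i) finds the route with the lowest crash probability. The graph, node names, and probabilities below are hypothetical, and this is an illustration rather than the authors' implementation.

```python
import heapq
import math


def safest_route(graph, source, target):
    """Return (crash_probability, path) for the route minimizing crash risk.

    graph maps node -> list of (neighbor, p_crash) with 0 <= p_crash < 1.
    Assumes independent Bernoulli trials per segment (an assumption here).
    """
    # Edge cost -log(1 - p) is non-negative, so Dijkstra applies.
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, p_crash in graph.get(node, []):
            new_cost = cost - math.log1p(-p_crash)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    # Convert the summed cost back to a crash probability.
    return 1.0 - math.exp(-best[target]), path


# Hypothetical toy network: the direct road A-D is riskier than the detour.
toy_graph = {
    "A": [("B", 0.001), ("D", 0.010)],
    "B": [("C", 0.002)],
    "C": [("D", 0.001)],
}
probability, route = safest_route(toy_graph, "A", "D")
```

On this toy network the three-segment detour is preferred over the direct road because its combined crash probability is lower. A trade-off between safety and time, as the abstract mentions, could be approximated by adding a weighted travel-time term to each edge cost, though the weighting scheme would be a further assumption.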

Accident Risk Prediction based on Heterogeneous Sparse Data: New Dataset and Insights

Machine Learning

Reducing traffic accidents is an important public safety challenge; accident analysis and prediction have therefore been a topic of much research over the past few decades. Existing studies have important shortcomings: they use small-scale datasets with limited coverage, depend on extensive sets of data, and are not applicable to real-time purposes. To address these challenges, we propose a new solution for real-time traffic accident prediction using easy-to-obtain but sparse data. Our solution relies on a deep-neural-network model (which we have named DAP, for Deep Accident Prediction) that utilizes a variety of data attributes such as traffic events, weather data, points-of-interest, and time. DAP incorporates multiple components, including a recurrent component (for time-sensitive data), a fully connected component (for time-insensitive data), and a trainable embedding component (to capture spatial heterogeneity). To fill the data gap, we have created, through a comprehensive process of data collection, integration, and augmentation, a large-scale publicly available database of accident information named US-Accidents. By employing the US-Accidents dataset in an extensive set of experiments across several large cities, we have evaluated our proposal against several baselines. Our analysis and results show significant improvements in predicting rare accident events. Further, we have shown the impact of traffic information, time, and points-of-interest data on real-time accident prediction.

High-Resolution Road Vehicle Collision Prediction for the City of Montreal

Machine Learning

Road accidents are an important issue in modern societies, responsible for millions of deaths and injuries every year worldwide. In Quebec alone, road accidents are responsible for hundreds of deaths and tens of thousands of injuries. In this paper, we show how one can leverage the open datasets of a city like Montreal, Canada, to create high-resolution accident prediction models using state-of-the-art big data analytics. Compared to other studies in road accident prediction, we have a much higher prediction resolution, i.e., our models predict the occurrence of an accident within an hour, on road segments defined by intersections. Such models could be used for road accident prevention, but also to identify key factors that can lead to a road accident and, consequently, to help develop new policies. We tested various machine learning methods to deal with the severe class imbalance inherent to accident prediction problems. In particular, we implemented the Balanced Random Forest algorithm, a variant of the Random Forest machine learning algorithm, in Apache Spark. Experimental results show that 85% of road vehicle collisions are detected by our model with a false positive rate of 13%. The examples identified as positive are likely to correspond to high-risk situations. In addition, we identify the most important predictors of vehicle collisions for the area of Montreal: the count of accidents on the same road segment during previous years, the temperature, the day of the year, the hour, and the visibility.
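As a rough illustration of how a Balanced Random Forest counters class imbalance, the sketch below draws the training sample for a single tree: a bootstrap sample of the minority class plus an equally sized sample, drawn with replacement, from the majority class. It covers only that sampling step using the Python standard library and is not the paper's Apache Spark implementation; the function name and toy labels are hypothetical.

```python
import random


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree's balanced training sample.

    labels is a sequence of 0 (no collision) and 1 (collision).
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((positives, negatives), key=len)
    n = len(minority)
    # Bootstrap the minority class, then draw the same number of
    # majority-class rows with replacement.
    sample = rng.choices(minority, k=n) + rng.choices(majority, k=n)
    rng.shuffle(sample)
    return sample


# Hypothetical toy data: 5 collisions among 1,000 segment-hours.
toy_labels = [1] * 5 + [0] * 995
indices = balanced_bootstrap(toy_labels, random.Random(0))
positive_share = sum(toy_labels[i] for i in indices) / len(indices)
```

Each tree in the forest would be trained on a fresh sample of this kind, so every tree sees collisions and non-collisions in equal proportion even though collisions are rare in the full data.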