Goto

Collaborating Authors

 Sun, Lijun


Bayesian Temporal Factorization for Multidimensional Time Series Prediction

arXiv.org Machine Learning

Abstract--Large-scale and multidimensional spatiotemporal data sets are becoming ubiquitous in many real-world applications such as monitoring urban traffic and air quality . Making predictions on these time series has become a critical challenge due to not only the large-scale and high-dimensional nature but also the considerable amount of missing data. In this paper, we propose a Bayesian temporal factorization (BTF) framework for modeling multidimensional time series--in particular spatiotemporal data--in the presence of missing values. By integrating low-rank matrix/tensor factorization and vector autoregressive (VAR) process into a single probabilistic graphical model, this framework can characterize both global and local consistencies in large-scale time series data. The graphical model allows us to effectively perform probabilistic predictions and produce uncertainty estimates without imputing those missing values. We develop efficient Gibbs sampling algorithms for model inference and test the proposed BTF framework on several real-world spatiotemporal data sets for both missing data imputation and short-term/long-term rolling prediction tasks. The numerical experiments demonstrate the superiority of the proposed BTF approaches over many state-of-the-art techniques. With recent advances in sensing technologies, large-scale and multidimensional time series data--in particular spatiotemporal data--are collected on a continuous basis from various types of sensors and applications. Making predictions on these time series, such as forecasting urban traffic states and regional air quality, serves as a foundation to many real-world applications and benefits many scientific fields [1], [2]. For example, predicting the demand and states (e.g., speed, flow) of urban traffic is essential to a wide range of intelligent transportation systems (ITS) applications, such trip planning, travel time estimation, route planning, traffic signal control, to name but a few [3]. However, given the complex spatiotemporal dependencies in these data sets, making efficient and reliable predictions for real-time applications has been a longstanding and fundamental research challenge. Despite the vast body of literature on time series analysis from many scientific areas, three emerging issues in modern sensing technologies are constantly challenging the classical modeling frameworks. First, modern time series data are often large-scale, collected from a large number of subjects/locations/sensors simultaneously .