Ma, Xuan
Origin-Destination Demand Prediction: An Urban Radiation and Attraction Perspective
Ma, Xuan, Bao, Zepeng, Zhong, Ming, Zhu, Yuanyuan, Li, Chenliang, Jiang, Jiawei, Li, Qing, Qian, Tieyun
In recent years, origin-destination (OD) demand prediction has gained significant attention for its profound implications for urban development. Existing deep learning methods primarily focus on the spatial or temporal dependencies between regions yet neglect the fundamental functional differences between regions. Although physical methods have characterised regions' functions by their radiation and attraction capacities, these functions are defined on numerical factors such as population and do not consider regions' intrinsic nominal attributes, e.g., whether a region is a residential or an industrial district. Moreover, the complicated relationships between the two types of capacities, e.g., the radiation capacity of a residential district in the morning being transformed into attraction capacity in the evening, are entirely missing from physical methods. In this paper, we not only generalize the physical radiation and attraction capacities into the deep learning framework with the extended capability to fulfil regions' functions, but also present a new model that captures the relationships between the two types of capacities. Specifically, we first model regions' radiation and attraction capacities using a bilateral branch network, with each branch equipped with regions' attribute representations. We then describe the transformation relationship between different capacities within the same region using a parameter generation method. Finally, we unveil the competition relationship between different regions with the same attraction capacity through adversarial learning. Extensive experiments on two city datasets demonstrate the consistent improvements of our method over state-of-the-art baselines, as well as the good explainability of regions' functions using their nominal attributes.
With the spread of ride-hailing platforms like Uber and Didi, intelligent transportation systems have emerged as a vibrant research domain [1]-[3]. These systems are designed to offer convenient ride services, improve public transportation efficiency through proactive order assignment, and optimize profitability by identifying high-profit routes based on historical passenger demands [4]. Among the wide spectrum of applications, traffic demand forecasting is the focal point due to its vital role in urban development, traffic control, and route planning [5]-[11]. The conventional task in this field is to predict the potential number of passenger demands in a specific region [10], [12], [13]. However, such a task is unable to capture associations in inter-regional flows.
Tieyun Qian is the corresponding author.
Figure 1: (a) An illustration of the region partition in Manhattan, New York; (b) and (c) visualizations of the taxi outflow and inflow demand, respectively, on 2019-01-17 in the designated region marked in red in (a).
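The difference between per-region demand and OD demand can be illustrated with a minimal sketch. The trip records, region labels, and function names below are hypothetical and are not taken from the paper or its datasets. The sketch counts trips per (origin, destination) pair and then recovers each region's outflow and inflow as marginal sums, which is the information a per-region forecast keeps while discarding the pairing.

```python
from collections import defaultdict

# Hypothetical trip records as (origin_region, destination_region) pairs.
trips = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "A"), ("C", "B")]


def od_matrix(trips):
    """Count trips for every (origin, destination) pair."""
    counts = defaultdict(int)
    for origin, destination in trips:
        counts[(origin, destination)] += 1
    return dict(counts)


def outflow(od):
    """Total demand leaving each region (row sums of the OD matrix)."""
    totals = defaultdict(int)
    for (origin, _), n in od.items():
        totals[origin] += n
    return dict(totals)


def inflow(od):
    """Total demand arriving at each region (column sums of the OD matrix)."""
    totals = defaultdict(int)
    for (_, destination), n in od.items():
        totals[destination] += n
    return dict(totals)


od = od_matrix(trips)
out_totals = outflow(od)
in_totals = inflow(od)
```

Here region A has the largest outflow and region B the largest inflow, but only `od` records that most of A's outgoing trips end in B.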
Robust bilinear factor analysis based on the matrix-variate $t$ distribution
Ma, Xuan, Zhao, Jianhua, Shang, Changchun, Jiang, Fen, Yu, Philip L. H.
Factor analysis based on the multivariate $t$ distribution ($t$fa) is a useful robust tool for extracting common factors from heavy-tailed or contaminated data. However, $t$fa is only applicable to vector data. When $t$fa is applied to matrix data, it is common to first vectorize the matrix observations. This introduces two challenges for $t$fa: (i) the inherent matrix structure of the data is broken, and (ii) robustness may be lost, as vectorized matrix data typically have a high dimension, which can easily lead to the breakdown of $t$fa. To address these issues, starting from the intrinsic matrix structure of matrix data, a novel robust factor analysis model, namely bilinear factor analysis built on the matrix-variate $t$ distribution ($t$bfa), is proposed in this paper. The novelty is that it can simultaneously extract common factors for both the row and column variables of interest from heavy-tailed or contaminated matrix data. Two efficient algorithms for maximum likelihood estimation of $t$bfa are developed. A closed-form expression for the Fisher information matrix, used to calculate the accuracy of the parameter estimates, is derived. Empirical studies are conducted to understand the proposed $t$bfa model and to compare it with related competitors. The results demonstrate the superiority and practicality of $t$bfa. Importantly, $t$bfa exhibits a significantly higher breakdown point than $t$fa, making it more suitable for matrix data.
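To give a sense of how quickly the dimension grows under vectorization, the following sketch compares the number of free covariance parameters for a vectorized p x q observation with the number for a separable (Kronecker-structured) row and column covariance. The separable form is used here only as an illustrative assumption about matrix-structured models in general; it is not claimed to be the exact parameterization of $t$bfa, and the function names and the sizes p = 20, q = 30 are hypothetical.

```python
def full_cov_params(p, q):
    """Free parameters of an unstructured covariance for vec(X), X of size p x q."""
    d = p * q  # dimension after vectorization
    return d * (d + 1) // 2


def separable_cov_params(p, q):
    """Free parameters of a row covariance (p x p) plus a column covariance (q x q).

    One parameter is subtracted because the overall scale can be moved
    freely between the two Kronecker factors.
    """
    return p * (p + 1) // 2 + q * (q + 1) // 2 - 1


p, q = 20, 30  # hypothetical matrix size
vectorized_dim = p * q
n_full = full_cov_params(p, q)
n_separable = separable_cov_params(p, q)
print(vectorized_dim, n_full, n_separable)
```

For this size the vectorized observation has 600 entries, and the unstructured covariance needs 180300 parameters against 674 for the separable form.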
Robust factored principal component analysis for matrix-valued outlier accommodation and detection
Ma, Xuan, Zhao, Jianhua, Wang, Yue
Principal component analysis (PCA) is a popular dimension reduction technique for vector data. Factored PCA (FPCA) is a probabilistic extension of PCA for matrix data, which can substantially reduce the number of parameters in PCA while yielding satisfactory performance. However, FPCA is based on the Gaussian assumption and is thereby susceptible to outliers. Although the multivariate $t$ distribution has a very long history as a robust modeling tool for vector data, its application to matrix data is very limited. The main reason is that the dimension of the vectorized matrix data is often very high, and the higher the dimension, the lower the breakdown point that measures the robustness. To solve the robustness problem suffered by FPCA and make it applicable to matrix data, in this paper we propose a robust extension of FPCA (RFPCA), which is built upon a $t$-type distribution called the matrix-variate $t$ distribution. Like the multivariate $t$ distribution, the matrix-variate $t$ distribution can adaptively down-weight outliers and yield robust estimates. We develop a fast EM-type algorithm for parameter estimation. Experiments on synthetic and real-world datasets reveal that RFPCA compares favorably with several related methods and that RFPCA is a simple but powerful tool for matrix-valued outlier detection.
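The down-weighting behaviour mentioned above can be sketched for the ordinary multivariate $t$ distribution, where the E-step of the standard EM algorithm assigns each observation the weight u = (nu + d) / (nu + delta^2), with d the dimension, nu the degrees of freedom, and delta^2 the squared Mahalanobis distance. This is the textbook vector case, shown under the simplifying assumption of a diagonal covariance; it is not the RFPCA algorithm itself, and the data points, parameter values, and function names are hypothetical.

```python
def sq_mahalanobis_diag(x, mean, variances):
    """Squared Mahalanobis distance of x under a diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))


def t_weights(sq_distances, dim, nu):
    """E-step weights of a multivariate t model: (nu + dim) / (nu + delta^2)."""
    return [(nu + dim) / (nu + d2) for d2 in sq_distances]


# Two typical points and one gross outlier (hypothetical data).
points = [[0.1, -0.2], [0.3, 0.1], [8.0, -7.5]]
mean = [0.0, 0.0]
variances = [1.0, 1.0]
nu = 3.0  # assumed degrees of freedom

distances = [sq_mahalanobis_diag(p, mean, variances) for p in points]
weights = t_weights(distances, dim=2, nu=nu)

for point, weight in zip(points, weights):
    print(point, round(weight, 3))
```

Points near the mean receive weights above 1, while the distant point receives a weight close to 0, so it contributes little to the weighted mean and covariance updates. The same weights can serve as a simple outlier score: the smaller the weight, the more suspicious the observation.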