multivariate time series imputation
Frequency-aware Generative Models for Multivariate Time Series Imputation
Missing data in multivariate time series are a common issue that can affect analysis and downstream applications. Although multivariate time series data generally consist of trend, seasonal, and residual terms, existing works mainly focus on modeling the first two. However, we find that the residual term is more crucial for accurate imputation, since it is more closely related to the diverse changes in the data and is the biggest component of imputation errors. Therefore, in this study, we introduce frequency-domain information and design Frequency-aware Generative Models for Multivariate Time Series Imputation (FGTI). Specifically, FGTI employs a high-frequency filter to boost residual-term imputation, supplemented by a dominant-frequency filter for trend and seasonal imputation. A cross-domain representation learning module then fuses frequency-domain insights with deep representations. Experiments over various datasets with real-world missing values show that FGTI achieves superior performance in both data imputation and downstream applications.
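The two filters named in the abstract can be illustrated with a plain FFT. The following is a minimal sketch, not FGTI's implementation: the function names, the top-`k` selection rule, and the `cutoff_ratio` parameter are assumptions made for illustration, and the input is assumed to be a fully observed (or already pre-filled) 1-D series, since the FFT cannot handle missing entries directly.

```python
import numpy as np


def dominant_frequency_filter(x, k=3):
    """Keep only the k largest-magnitude frequency bins (hypothetical rule)."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x)
    keep = np.argsort(np.abs(spec))[-k:]          # indices of the k strongest bins
    mask = np.zeros(spec.shape, dtype=bool)
    mask[keep] = True
    return np.fft.irfft(np.where(mask, spec, 0), n=len(x))


def high_frequency_filter(x, cutoff_ratio=0.25):
    """Zero out the lowest bins and keep the rest (hypothetical cutoff)."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x).copy()
    cutoff = int(cutoff_ratio * spec.shape[0])    # first bin that is kept
    spec[:cutoff] = 0
    return np.fft.irfft(spec, n=len(x))


if __name__ == "__main__":
    t = np.arange(64)
    slow = np.sin(2 * np.pi * 2 * t / 64)         # seasonal-like component
    fast = 0.1 * np.sin(2 * np.pi * 20 * t / 64)  # residual-like component
    series = slow + fast
    print(np.abs(dominant_frequency_filter(series, k=1) - slow).max())
    print(np.abs(high_frequency_filter(series) - fast).max())
```

On this synthetic series the dominant-frequency branch recovers the slow sinusoid and the high-frequency branch recovers the small fast one, which mirrors the trend/seasonal versus residual split the abstract describes.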
Physics-incorporated Graph Neural Network for Multivariate Time Series Imputation
Liang, Guojun, Tiwari, Prayag, Nowaczyk, Slawomir, Byttner, Stefan
Exploring the missing values is an essential but challenging issue due to the complex latent spatio-temporal correlation and dynamic nature of time series. Owing to the outstanding performance in dealing with structure learning potentials, Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs) are often used to capture such complex spatio-temporal features in multivariate time series. However, these data-driven models often fail to capture the essential spatio-temporal relationships when significant signal corruption occurs. Additionally, calculating the high-order neighbor nodes in these models is of high computational complexity. To address these problems, we propose a novel higher-order spatio-temporal physics-incorporated GNN (HSPGNN). Firstly, the dynamic Laplacian matrix can be obtained by the spatial attention mechanism. Then, the generic inhomogeneous partial differential equation (PDE) of physical dynamic systems is used to construct the dynamic higher-order spatio-temporal GNN to obtain the missing time series values. Moreover, we estimate the missing impact by Normalizing Flows (NF) to evaluate the importance of each node in the graph for better explainability. Experimental results on four benchmark datasets demonstrate the effectiveness of HSPGNN and the superior performance when combining various order neighbor nodes. Also, graph-like optical flow, dynamic graphs, and missing impact can be obtained naturally by HSPGNN, which provides better dynamic analysis and explanation than traditional data-driven models. Our code is available at https://github.com/gorgen2020/HSPGNN.
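The abstract states that a dynamic Laplacian is obtained from a spatial attention mechanism but gives no formula. The sketch below shows one common way to do this, under stated assumptions: scaled dot-product attention over node features, symmetrization of the attention matrix, and the combinatorial Laplacian L = D - W. The function name and the projection matrices `w_query` and `w_key` are hypothetical, and this is not claimed to be the construction used in HSPGNN.

```python
import numpy as np


def attention_laplacian(node_features, w_query, w_key):
    """Build a graph Laplacian from attention scores (illustrative only).

    node_features: (N, F) array, one row per node at the current time step.
    w_query, w_key: (F, d) projection matrices (hypothetical parameters).
    """
    q = node_features @ w_query
    k = node_features @ w_key
    scores = q @ k.T / np.sqrt(q.shape[1])

    # Row-wise softmax, shifted for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)

    # Symmetrize and drop self-loops so the result is an undirected graph.
    adj = 0.5 * (attn + attn.T)
    np.fill_diagonal(adj, 0.0)

    degree = np.diag(adj.sum(axis=1))
    return degree - adj


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(5, 4))
    lap = attention_laplacian(feats, rng.normal(size=(4, 3)), rng.normal(size=(4, 3)))
    print(lap.round(3))
```

Because the features change over time, recomputing this matrix at every step gives a time-varying ("dynamic") Laplacian; each row sums to zero and the matrix is positive semidefinite by construction.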
Deep Learning for Multivariate Time Series Imputation: A Survey
Wang, Jun, Du, Wenjie, Cao, Wei, Zhang, Keli, Wang, Wenjia, Liang, Yuxuan, Wen, Qingsong
The ubiquitous missing values cause the multivariate time series data to be partially observed, destroying the integrity of time series and hindering the effective time series data analysis. Recently deep learning imputation methods have demonstrated remarkable success in elevating the quality of corrupted time series data, subsequently enhancing performance in downstream tasks. In this paper, we conduct a comprehensive survey on the recently proposed deep learning imputation methods. First, we propose a taxonomy for the reviewed methods, and then provide a structured review of these methods by highlighting their strengths and limitations. We also conduct empirical experiments to study different methods and compare their enhancement for downstream tasks. Finally, the open issues for future research on multivariate time series imputation are pointed out. All code and configurations of this work, including a regularly maintained multivariate time series imputation paper list, can be found in the GitHub repository https://github.com/WenjieDu/Awesome_Imputation.
GATGPT: A Pre-trained Large Language Model with Graph Attention Network for Spatiotemporal Imputation
Chen, Yakun, Wang, Xianzhi, Xu, Guandong
The presence of multivariate time series data is extensively documented across a variety of sectors including economics, transportation, healthcare, and meteorology, as evidenced in several studies [1, 2, 3, 4]. A range of statistical and machine learning techniques have been shown to perform effectively on complete datasets in several time series tasks, including forecasting [5], classification [6], and anomaly detection [7]. However, it is often observed that multivariate time series data collected from real-world scenarios are prone to missing values due to various factors, such as sensor malfunctions and data transmission errors. These missing values can considerably affect the quality of the data, subsequently impacting the effectiveness of the aforementioned methods in their respective tasks. Extensive research efforts have been dedicated to addressing the challenges in spatiotemporal imputation. A typical approach involves the development of a distinct framework for initially estimating missing values, followed by the application of the completed dataset in another sophisticated framework for subsequent operations like forecasting, classification, and anomaly detection. To fill in missing values, various statistical and machine learning techniques are applied.
Multivariate Time Series Imputation by Graph Neural Networks
Cini, Andrea, Marisca, Ivan, Alippi, Cesare
Dealing with missing values and incomplete time series is a labor-intensive, time-consuming, and inevitable task when handling data coming from real-world applications. Effective spatio-temporal representations would allow imputation methods to reconstruct missing temporal data by exploiting information coming from sensors at different locations. However, standard methods fall short in capturing the nonlinear time and space dependencies existing within networks of interconnected sensors and do not take full advantage of the available (and often strong) relational information. Notably, most state-of-the-art imputation methods based on deep learning do not explicitly model relational aspects and, in any case, do not exploit processing frameworks able to adequately represent structured spatio-temporal data. Conversely, graph neural networks have recently surged in popularity as both expressive and scalable tools for processing sequential data with relational inductive biases. In this work, we present the first assessment of graph neural networks in the context of multivariate time series imputation. In particular, we introduce a novel graph neural network architecture, named GRIL, which aims at reconstructing missing data in the different channels of a multivariate time series by learning spatial-temporal representations through message passing. Preliminary empirical results show that our model outperforms state-of-the-art methods in the imputation task on relevant benchmarks with mean absolute error improvements often higher than 20%.
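The idea of reconstructing a missing reading from sensors at other locations can be shown with a single, non-learned round of message passing. This is a deliberately simple sketch and not the GRIL architecture: the function name is hypothetical, the adjacency weights are assumed to be given, and each missing value is filled with the weighted mean of its observed neighbors at the same time step.

```python
import numpy as np


def neighbor_impute(values, observed, adjacency):
    """Fill missing entries with a weighted mean of observed neighbors.

    values:    (T, N) array; entries where observed is False are ignored.
    observed:  (T, N) boolean mask, True where a reading exists.
    adjacency: (N, N) non-negative weights; adjacency[i, j] is the weight
               of the message sent from node j to node i.
    """
    values = np.asarray(values, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    adjacency = np.asarray(adjacency, dtype=float)

    masked = np.where(observed, values, 0.0)          # missing nodes send nothing
    weighted_sum = masked @ adjacency.T               # aggregated messages, (T, N)
    weight_total = observed.astype(float) @ adjacency.T

    # Nodes with no observed neighbor get 0/0 and therefore stay NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        neighbor_mean = weighted_sum / weight_total

    return np.where(observed, values, neighbor_mean)


if __name__ == "__main__":
    path_graph = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
    readings = np.array([[1.0, np.nan, 3.0], [np.nan, 4.0, np.nan]])
    print(neighbor_impute(readings, ~np.isnan(readings), path_graph))
```

A learned model would replace the fixed averaging with trainable message and update functions and would also propagate information along the time axis; the sketch only shows the spatial aggregation step.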
Multivariate Time Series Imputation with Generative Adversarial Networks
Luo, Yonghong, Cai, Xiangrui, Zhang, Ying, Xu, Jun, Yuan, Xiaojie
Multivariate time series usually contain a large number of missing values, which hinders the application of advanced analysis methods on multivariate time series data. Conventional approaches to addressing the challenge of missing values, including mean/zero imputation, case deletion, and matrix factorization-based imputation, are all incapable of modeling the temporal dependencies and the complex distributions in multivariate time series. In this paper, we treat the problem of missing value imputation as data generation. Inspired by the success of Generative Adversarial Networks (GAN) in image generation, we propose to learn the overall distribution of a multivariate time series dataset with GAN, which is further used to generate the missing values for each sample. Unlike image data, time series data are usually incomplete due to the nature of the data recording process.
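For reference, the mean-imputation baseline mentioned among the conventional approaches can be written in a few lines. This is a minimal sketch with a hypothetical function name, assuming missing values are encoded as NaN in a (time steps, variables) array.

```python
import numpy as np


def mean_impute(series):
    """Replace NaNs with the per-variable mean of the observed values.

    series: (T, D) array with NaN marking missing entries.
    Returns the filled array and the boolean observation mask.
    """
    series = np.asarray(series, dtype=float)
    mask = ~np.isnan(series)                 # True where a value was recorded
    col_mean = np.nanmean(series, axis=0)    # NaN for a variable with no observations
    return np.where(mask, series, col_mean), mask


if __name__ == "__main__":
    data = np.array([[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]])
    filled, mask = mean_impute(data)
    print(filled)
```

Every gap in a variable receives the same constant regardless of when it occurs, which is the limitation the abstract points to: the fill ignores temporal dependencies entirely.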