Diffusion Auto-regressive Transformer for Effective Self-supervised Time Series Forecasting

Wang, Daoyu, Cheng, Mingyue, Liu, Zhiding, Liu, Qi, Chen, Enhong

arXiv.org Artificial Intelligence 

Self-supervised learning has become a popular and effective approach for enhancing time series forecasting, enabling models to learn universal representations from unlabeled data. However, effectively capturing both global sequence dependencies and local detail features within time series data remains challenging. To address this, we propose a novel generative self-supervised method called TimeDART, which stands for Diffusion Auto-regressive Transformer for Time series forecasting. In TimeDART, we treat time series patches as the basic modeling units. Specifically, we employ a self-attention-based Transformer encoder to model inter-patch dependencies. Additionally, we introduce diffusion and denoising mechanisms to capture fine-grained local features within each patch. Notably, we design a cross-attention-based denoising decoder that allows the difficulty of the self-supervised task to be adjusted, facilitating more effective pre-training. Furthermore, the entire model is optimized in an auto-regressive manner to obtain transferable representations. Extensive experiments demonstrate that TimeDART achieves state-of-the-art fine-tuning performance compared with the most competitive existing methods on forecasting tasks.

Time series forecasting (Harvey, 1990; Hamilton, 2020; Box et al., 2015; Cheng et al., 2024b) is crucial in a wide array of domains, including finance (Black & Scholes, 1973), healthcare (Cheng et al., 2024c), and energy management (Zhou et al., 2024). Accurate predictions of future data points enable better decision-making, resource allocation, and risk management, ultimately leading to significant operational improvements and strategic advantages. Among the various methods developed for time series forecasting (Miller et al., 2024), deep neural networks (Ding et al., 2024; Jin et al., 2023; Cao et al., 2023; Cheng et al., 2024b) have emerged as a popular and effective solution paradigm.
To further enhance the performance of time series forecasting, self-supervised learning has become an increasingly popular research paradigm (Nie et al., 2022).
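To make the two core ingredients concrete, the following is a minimal sketch of patch-level modeling combined with a standard DDPM-style forward diffusion process applied to each patch. The function names (`patchify`, `diffusion_forward`), the non-overlapping patching, and the linear noise schedule are illustrative assumptions for exposition; they are not taken from the paper's actual implementation.

```python
import numpy as np

def patchify(series, patch_len):
    # Split a 1-D series into non-overlapping patches, the basic
    # modeling units fed to the Transformer encoder (illustrative choice;
    # the paper's exact patching scheme may differ).
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def diffusion_forward(x0, t, betas, rng):
    # Standard DDPM forward process applied independently to each patch:
    #   q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)
    # where abar_t is the cumulative product of (1 - beta_s) up to step t.
    alphas = 1.0 - betas
    abar_t = np.cumprod(alphas)[t]
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * noise
    return x_t, noise  # the denoising decoder would be trained to recover x0 / noise

# Usage: noise the patches, then a denoising decoder (not shown) learns
# to reconstruct them, which is the self-supervised objective.
rng = np.random.default_rng(0)
patches = patchify(np.arange(16.0), patch_len=4)
betas = np.linspace(1e-4, 2e-2, 100)  # assumed linear schedule
noisy_patches, eps = diffusion_forward(patches, t=50, betas=betas, rng=rng)
```

The adjustable difficulty mentioned in the abstract corresponds, in this sketch, to choosing the diffusion step `t`: larger `t` destroys more of the patch, making the denoising task harder.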