Liu, Juncheng
Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models
Liu, Xu, Aksu, Taha, Liu, Juncheng, Wen, Qingsong, Liang, Yuxuan, Xiong, Caiming, Savarese, Silvio, Sahoo, Doyen, Li, Junnan, Liu, Chenghao
Time series analysis is crucial for understanding the dynamics of complex systems. Recent advances in foundation models have led to task-agnostic Time Series Foundation Models (TSFMs) and Large Language Model-based Time Series Models (TSLLMs), enabling generalized learning and the integration of contextual information. However, their success depends on large, diverse, and high-quality datasets, which are difficult to build due to regulatory, diversity, quality, and quantity constraints. Synthetic data emerges as a viable solution, addressing these challenges by offering scalable, unbiased, and high-quality alternatives. This survey provides a comprehensive review of synthetic data for TSFMs and TSLLMs, analyzing data generation strategies, their role in model pretraining, fine-tuning, and evaluation, and identifying future research directions.
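To make the idea of synthetic pretraining data concrete, here is a toy sketch of one common recipe: composing random trend, seasonal, and noise components into an unlimited supply of series. This is my own minimal illustration, not a specific method from the survey; the component choices and parameter ranges are arbitrary.

```python
import numpy as np

# Toy synthetic-series generator: trend + seasonality + noise, with randomized
# parameters per series. Illustrative only; ranges below are arbitrary choices.
rng = np.random.default_rng(7)

def synth_series(length: int = 512) -> np.ndarray:
    t = np.arange(length)
    trend = rng.normal(0, 0.01) * t + rng.normal(0, 1)            # random linear trend
    period = rng.integers(8, 64)                                   # random seasonal period
    season = rng.uniform(0.5, 2.0) * np.sin(2 * np.pi * t / period + rng.uniform(0, 2 * np.pi))
    noise = rng.normal(0, rng.uniform(0.05, 0.3), size=length)     # random noise level
    return trend + season + noise

corpus = np.stack([synth_series() for _ in range(1000)])           # a small synthetic corpus
print(corpus.shape)                                                # (1000, 512)
```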
GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation
Aksu, Taha, Woo, Gerald, Liu, Juncheng, Liu, Xu, Liu, Chenghao, Savarese, Silvio, Xiong, Caiming, Sahoo, Doyen
Time series foundation models excel in zero-shot forecasting, handling diverse tasks without explicit training. However, the advancement of these models has been hindered by the lack of comprehensive benchmarks. To address this gap, we introduce the General TIme Series ForecasTing Model Evaluation, GIFT-Eval, a pioneering benchmark aimed at promoting evaluation across diverse datasets. GIFT-Eval encompasses 23 datasets comprising over 144,000 time series and 177 million data points, spanning seven domains, 10 frequencies, multivariate inputs, and prediction lengths ranging from short- to long-term forecasts. To facilitate the effective pretraining and evaluation of foundation models, we also provide a non-leaking pretraining dataset containing approximately 230 billion data points. Additionally, we provide a comprehensive analysis of 17 baselines, including statistical models, deep learning models, and foundation models. We discuss each model in the context of various benchmark characteristics and offer a qualitative analysis that spans both deep learning and foundation models. We believe the insights from this analysis, along with access to this new standard zero-shot time series forecasting benchmark, will guide future developments in time series foundation models.

The success of foundation model pretraining in language and vision modalities has catalyzed similar progress in time series forecasting. By pretraining on extensive time series datasets, a universal forecasting model can be developed, equipped to address varied downstream forecasting tasks across multiple domains, frequencies, prediction lengths, and numbers of variates in a zero-shot manner (Woo et al., 2024; Rasul et al., 2023; Ansari et al., 2024). A critical aspect of foundation model research is creating a high-quality benchmark that includes large, diverse evaluation data and, preferably, non-leaking pretraining data, so that models can be evaluated fairly and their weaknesses identified. Research in Natural Language Processing (NLP) has produced key benchmarks such as GLUE and MMLU (Wang et al., 2018; Hendrycks et al., 2020; Srivastava et al., 2022; Chen et al., 2021), which are crucial for developing high-quality models. Unlike NLP, time series foundation models lack a unified, diverse benchmark for fair comparison. For instance, Woo et al. (2024) introduce LOTSA, which remains the largest collection of time series forecasting pretraining data to date. However, the proposed architecture, Moirai, is evaluated on existing benchmarks tailored to specific forecasting tasks, such as the LSF dataset (Zhou et al., 2020) for long-term forecasting and the Monash dataset (Godahewa et al., 2021) for univariate forecasting.
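The evaluation protocol the abstract describes (many datasets, varying frequencies and horizons, a shared scaled metric) can be sketched generically. The snippet below is a hedged illustration of such a zero-shot evaluation loop using a seasonal-naive baseline and MASE on mock data; the dataset names, model, and metric choice are my own placeholders, not the GIFT-Eval loaders or leaderboard setup.

```python
import numpy as np

def seasonal_naive(history: np.ndarray, horizon: int, season: int) -> np.ndarray:
    """Repeat the last observed season as the forecast (a common zero-shot baseline)."""
    last_season = history[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_season, reps)[:horizon]

def mase(y_true, y_pred, history, season):
    """Mean absolute scaled error, scaled by the in-sample seasonal-naive error."""
    scale = np.mean(np.abs(history[season:] - history[:-season]))
    return np.mean(np.abs(y_true - y_pred)) / scale

# Mock "datasets" with different seasonal periods and prediction lengths.
rng = np.random.default_rng(0)
datasets = {
    "hourly_demo": {"season": 24, "horizon": 48},
    "daily_demo":  {"season": 7,  "horizon": 14},
}

for name, cfg in datasets.items():
    t = np.arange(400)
    series = 10 + np.sin(2 * np.pi * t / cfg["season"]) + 0.1 * rng.standard_normal(t.size)
    history, target = series[:-cfg["horizon"]], series[-cfg["horizon"]:]
    forecast = seasonal_naive(history, cfg["horizon"], cfg["season"])
    print(name, "MASE:", round(mase(target, forecast, history, cfg["season"]), 3))
```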
Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts
Liu, Xu, Liu, Juncheng, Woo, Gerald, Aksu, Taha, Liang, Yuxuan, Zimmermann, Roger, Liu, Chenghao, Savarese, Silvio, Xiong, Caiming, Sahoo, Doyen
Time series foundation models have demonstrated impressive performance as zero-shot forecasters. However, achieving effectively unified training on time series remains an open challenge. Existing approaches introduce some level of model specialization to account for the highly heterogeneous nature of time series data. For instance, Moirai pursues unified training by employing multiple input/output projection layers, each tailored to handle time series at a specific frequency. Similarly, TimesFM maintains a frequency embedding dictionary for this purpose. We identify two major drawbacks to this human-imposed frequency-level model specialization: (1) Frequency is not a reliable indicator of the underlying patterns in time series. For example, time series with different frequencies can display similar patterns, while those with the same frequency may exhibit varied patterns. (2) Non-stationarity is an inherent property of real-world time series, leading to varied distributions even within a short context window of a single time series. Frequency-level specialization is too coarse-grained to capture this level of diversity. To address these limitations, this paper introduces Moirai-MoE, using a single input/output projection layer while delegating the modeling of diverse time series patterns to the sparse mixture of experts (MoE) within Transformers. With these designs, Moirai-MoE reduces reliance on human-defined heuristics and enables automatic token-level specialization. Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios. Furthermore, this study conducts comprehensive model analyses to explore the inner workings of time series MoE foundation models and provides valuable insights for future research.
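The core mechanism named in the abstract, replacing frequency-specific projections with token-level routing through a sparse mixture of experts, can be illustrated with a minimal top-k MoE feed-forward layer. This is a generic sketch of sparse MoE routing, not the Moirai-MoE implementation; layer sizes, the router design, and the loop-based dispatch are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEFFN(nn.Module):
    """Token-level top-k routing over expert feed-forward networks (illustrative sizes)."""
    def __init__(self, d_model=64, d_hidden=128, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)           # router producing per-token logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                     # x: (batch, tokens, d_model)
        scores = self.gate(x)
        weights, idx = scores.topk(self.top_k, dim=-1)        # each token picks its top-k experts
        weights = F.softmax(weights, dim=-1)                  # normalize over the selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                       # tokens routed to expert e at slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(2, 32, 64)                               # e.g. patch embeddings of a series
print(SparseMoEFFN()(tokens).shape)                           # torch.Size([2, 32, 64])
```

The point of the sketch is that specialization is decided per token by the learned router rather than by a human-defined attribute such as sampling frequency.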
UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting
Liu, Juncheng, Liu, Chenghao, Woo, Gerald, Wang, Yiwei, Hooi, Bryan, Xiong, Caiming, Sahoo, Doyen
Transformer-based models have emerged as powerful tools for multivariate time series forecasting (MTSF). However, existing Transformer models often fall short of capturing the intricate dependencies across both the variate and temporal dimensions of MTS data. Some recent models have been proposed to capture variate and temporal dependencies separately, through either two sequential or two parallel attention mechanisms; however, these methods cannot directly and explicitly learn the intricate inter-series and intra-series dependencies. In this work, we first demonstrate that these dependencies are important because they commonly exist in real-world data. To model them directly, we propose UniTST, a Transformer-based model with a unified attention mechanism over flattened patch tokens. We additionally add a dispatcher module that reduces complexity and makes the model feasible for a potentially large number of variates. Although UniTST employs a simple architecture, it offers compelling performance, as shown in our extensive experiments on several datasets for time series forecasting.
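The two ideas named here, flattening the (variate, patch) grid into one token sequence so a single attention layer mixes inter- and intra-series information, and routing that attention through a small set of "dispatcher" tokens to keep the cost manageable, can be sketched as below. This is a rough interpretation under my own assumptions (the dispatcher is modeled as learned summary tokens with a gather/scatter pair of cross-attentions); it is not the UniTST reference implementation, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class DispatcherAttention(nn.Module):
    """Unified attention over flattened variate-patch tokens via learned dispatcher tokens."""
    def __init__(self, d_model=64, n_heads=4, n_dispatch=16):
        super().__init__()
        self.dispatch = nn.Parameter(torch.randn(n_dispatch, d_model))      # learned summary tokens
        self.collect = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.broadcast = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, tokens):                                # tokens: (B, V*P, d_model)
        b = tokens.size(0)
        d = self.dispatch.unsqueeze(0).expand(b, -1, -1)
        summary, _ = self.collect(d, tokens, tokens)          # dispatchers gather from all tokens
        out, _ = self.broadcast(tokens, summary, summary)     # tokens read back from dispatchers
        return out

V, P, d_model = 8, 12, 64                                     # variates, patches per variate
patches = torch.randn(2, V, P, d_model)
flat = patches.reshape(2, V * P, d_model)                     # unified token sequence
print(DispatcherAttention(d_model)(flat).shape)               # torch.Size([2, 96, 64])
```

Compared with full attention over V*P tokens, attention through a fixed number of dispatcher tokens grows linearly rather than quadratically in the number of tokens, which is what makes a large variate count feasible.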
A Fusion-Denoising Attack on InstaHide with Data Augmentation
Luo, Xinjian, Xiao, Xiaokui, Wu, Yuncheng, Liu, Juncheng, Ooi, Beng Chin
InstaHide is a state-of-the-art mechanism for protecting private training images, by mixing multiple private images and modifying them such that their visual features are indistinguishable to the naked eye. In recent work, however, Carlini et al. show that it is possible to reconstruct private images from the encrypted dataset generated by InstaHide. Nevertheless, we demonstrate that Carlini et al.'s attack can be easily defeated by incorporating data augmentation into InstaHide. This leads to a natural question: is InstaHide with data augmentation secure? In this paper, we provide a negative answer to this question, by devising an attack for recovering private images from the outputs of InstaHide even when data augmentation is present. The basic idea is to use a comparative network to identify encrypted images that are likely to correspond to the same private image, and then employ a fusion-denoising network for restoring the private image from the encrypted ones, taking into account the effects of data augmentation. Extensive experiments demonstrate the effectiveness of the proposed attack in comparison to Carlini et al.'s attack.
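For readers unfamiliar with the mechanism being attacked, the following is a simplified sketch of InstaHide-style encoding: each encrypted sample mixes several images with random weights and applies a random elementwise sign flip, with a toy augmentation standing in for the data augmentation considered in the paper. This is an illustration of the general scheme, not the attacked implementation or the attack itself, and the mixing parameters are my own choices.

```python
import numpy as np

rng = np.random.default_rng(42)

def instahide_encode(images, k=2):
    """Mix k randomly chosen images with random convex weights, then flip signs per pixel."""
    idx = rng.choice(len(images), size=k, replace=False)
    lam = rng.dirichlet(np.ones(k))                        # random mixing coefficients summing to 1
    mixed = sum(l * images[i] for l, i in zip(lam, idx))
    signs = rng.choice([-1.0, 1.0], size=mixed.shape)      # random per-pixel sign mask
    return signs * mixed, idx

def random_augment(image):
    """Toy data augmentation (random horizontal flip) applied before encoding."""
    return image[:, ::-1] if rng.random() < 0.5 else image

images = rng.random((10, 32, 32))                          # stand-in private dataset
augmented = np.stack([random_augment(im) for im in images])
encrypted, sources = instahide_encode(augmented)
print(encrypted.shape, "mixed from private images", sources)
```

The attack described in the abstract works against outputs of this kind: a comparative network groups encryptions that likely share a source image, and a fusion-denoising network reconstructs that source despite the mixing, sign flips, and augmentation.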
OneFlow: Redesign the Distributed Deep Learning Framework from Scratch
Yuan, Jinhui, Li, Xinqi, Cheng, Cheng, Liu, Juncheng, Guo, Ran, Cai, Shenghang, Yao, Chi, Yang, Fei, Yi, Xiaodong, Wu, Chuan, Zhang, Haoran, Zhao, Jie
Deep learning frameworks such as TensorFlow and PyTorch provide a productive interface for expressing and training a deep neural network (DNN) model on a single device or using data parallelism. Still, they may not be flexible or efficient enough in training emerging large models on distributed devices, which require more sophisticated parallelism beyond data parallelism. Plugins or wrappers have been developed to strengthen these frameworks for model or pipeline parallelism, but they complicate the usage and implementation of distributed deep learning. Aiming at a simple, neat redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model. SBP enables much easier programming of data parallelism and model parallelism than existing frameworks, and the actor model provides a succinct runtime mechanism to manage the complex dependencies imposed by resource constraints, data movement and computation in distributed deep learning. We demonstrate the general applicability and efficiency of OneFlow for training various large DNN models with case studies and extensive experiments. The results show that OneFlow outperforms many well-known customized libraries built on top of the state-of-the-art frameworks. The code of OneFlow is available at: https://github.com/Oneflow-Inc/oneflow.
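The SBP (split, broadcast, partial-value) abstraction can be illustrated without the framework itself: each SBP signature describes how one logical tensor relates to the local tensors held by each device. The snippet below is a conceptual NumPy illustration of those three placements and of how data parallelism falls out of them; it deliberately avoids the OneFlow API, so none of the names here correspond to framework calls.

```python
import numpy as np

logical = np.arange(12, dtype=float).reshape(4, 3)    # the logical (global) tensor
num_devices = 2

# split(0): each device holds a disjoint slice along axis 0.
split_locals = np.array_split(logical, num_devices, axis=0)

# broadcast: every device holds a full copy.
broadcast_locals = [logical.copy() for _ in range(num_devices)]

# partial_sum: the logical tensor is the elementwise sum of the local pieces
# (what naturally comes out of, e.g., a column-split matmul).
partial_locals = [logical / num_devices for _ in range(num_devices)]

assert np.allclose(np.concatenate(split_locals, axis=0), logical)
assert all(np.allclose(x, logical) for x in broadcast_locals)
assert np.allclose(sum(partial_locals), logical)

# Example: Y = X @ W with X split(0) and W broadcast yields Y split(0),
# i.e. data parallelism is just a particular SBP signature of matmul.
W = np.ones((3, 2))
Y_locals = [x @ W for x in split_locals]              # each device computes its own slice
assert np.allclose(np.concatenate(Y_locals, axis=0), logical @ W)
print("SBP round-trips verified")
```

In OneFlow, such signatures annotate global tensors and operators, and the runtime (built on the actor model) inserts the required communication between placements automatically.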