A Related Work

Neural Information Processing Systems 

A.1 Time Series Forecasting

We briefly review the literature on time series forecasting (TSF) methods. As a time series evolves, complex temporal patterns can manifest over both the short and the long term. To exploit this temporal evolution, statistical models such as ARIMA [6] and Gaussian process regression [7] are well established and have been applied to many downstream tasks [28, 29, 2]. Recurrent neural network (RNN) models have also been introduced to model temporal dependencies for TSF in a sequence-to-sequence paradigm [24, 9, 61, 40, 46, 50, 53]. In addition, temporal attention [49, 59, 56] and causal convolution [3, 5, 54] have been explored to model intrinsic temporal dependencies.