 temporal correlation


Introducing Spectral Attention for Long-Range Dependency in Time Series Forecasting

Neural Information Processing Systems

Sequence modeling faces challenges in capturing long-range dependencies across diverse tasks. Recent linear and transformer-based forecasters have shown superior performance in time series forecasting. However, they cannot effectively address long-range dependencies in time series data, primarily because they use fixed-size inputs for prediction. Furthermore, they typically sacrifice essential temporal correlation among consecutive training samples by shuffling them into mini-batches. To overcome these limitations, we introduce a fast and effective Spectral Attention mechanism, which preserves temporal correlations among samples and facilitates the handling of long-range information while maintaining the base model structure. Spectral Attention preserves long-period trends through a low-pass filter and allows gradients to flow between samples. Spectral Attention can be seamlessly integrated into most sequence models, allowing models with fixed-size look-back windows to capture long-range dependencies over thousands of steps. Through extensive experiments on 11 real-world time series datasets using 7 recent forecasting models, we consistently demonstrate the efficacy of our Spectral Attention mechanism, achieving state-of-the-art results.
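The abstract says that long-period trends are preserved through a low-pass filter. As a rough illustration of that idea only, the sketch below applies an exponential moving average (one common low-pass filter) to a series made of a slow trend plus a fast oscillation. The function name `ema_low_pass`, the smoothing factor `alpha`, and the toy signal are assumptions made for this example; the abstract does not specify which filter the authors use or how it is wired into a forecasting model.

```python
def ema_low_pass(values, alpha=0.9):
    """Exponential moving average: y_t = alpha * y_{t-1} + (1 - alpha) * x_t.

    A larger alpha keeps more of the past, so fast fluctuations are damped
    while slow trends pass through. (Hypothetical helper, not the paper's code.)
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed = []
    # Initialise the filter state with the first sample to avoid a cold start.
    state = values[0] if values else 0.0
    for x in values:
        state = alpha * state + (1.0 - alpha) * x
        smoothed.append(state)
    return smoothed


# Toy signal: a slow linear trend plus an oscillation that flips sign each step.
signal = [0.01 * t + (-1) ** t for t in range(1000)]
smoothed = ema_low_pass(signal, alpha=0.9)
```

In this toy case the smoothed series stays close to the linear trend, while the raw series swings by a full unit around it at every step.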




We thank the reviewers for their careful reading of our work and for their helpful comments

Neural Information Processing Systems

We thank the reviewers for their careful reading of our work and for their helpful comments. We will also clarify that the text in sections 2.1 and 2.2 ... In terms of experimental predictions, our work predicts the synaptic weights in the SFA circuit. One mechanism for implementing a quadratic expansion is the use of so-called "Sigma-Pi units" (Rumelhart, Hinton and ... (Mel and Koch, 1990). In this case, the derivation proceeds exactly as laid out in the paper. Thank you for pointing out the typos.
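For readers unfamiliar with the terms, the following is a minimal sketch of what a quadratic expansion and a Sigma-Pi style unit compute. It is a generic illustration, not the circuit from the paper being discussed: the function names, the `(weight, indices)` term format, and the example inputs are all assumptions made here.

```python
from itertools import combinations_with_replacement


def quadratic_expansion(x):
    """Return all degree-1 and degree-2 monomials of the input vector x."""
    linear = list(x)
    quadratic = [
        x[i] * x[j]
        for i, j in combinations_with_replacement(range(len(x)), 2)
    ]
    return linear + quadratic


def sigma_pi_unit(x, terms):
    """Weighted sum of products: sum over terms of weight * prod(x[i] for i in indices).

    `terms` is a list of (weight, indices) pairs; this format is hypothetical.
    """
    total = 0.0
    for weight, indices in terms:
        product = 1.0
        for i in indices:
            product *= x[i]
        total += weight * product
    return total


expanded = quadratic_expansion([2.0, 3.0])
# One linear term (x0) and one pairwise product (x0 * x1).
activation = sigma_pi_unit([2.0, 3.0], [(1.0, (0,)), (0.5, (0, 1))])
```

A unit with only single-index and pairwise terms is a weighted sum over exactly the features that the quadratic expansion produces, which is why the two are interchangeable in this setting.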



NOTE: Robust Continual Test-time Adaptation Against Temporal Correlation

Neural Information Processing Systems

Test-time adaptation (TTA) is an emerging paradigm that addresses distributional shifts between training and testing phases without additional data acquisition or labeling cost; only unlabeled test data streams are used for continual model adaptation. Previous TTA schemes assume that the test samples are independent and identically distributed (i.i.d.), even though they are often temporally correlated (non-i.i.d.) in application scenarios, e.g., autonomous driving. We discover that most existing TTA methods fail dramatically under such scenarios. Motivated by this, we present a new test-time adaptation scheme that is robust against non-i.i.d.


Capturing Complex Spatial-Temporal Dependencies in Traffic Forecasting: A Self-Attention Approach

arXiv.org Artificial Intelligence

We study the problem of traffic forecasting, aiming to predict the inflow and outflow of a region in the subsequent time slot. The problem is complex due to the intricate spatial and temporal interdependence among regions. Prior works study the spatial and temporal dependencies in a decoupled manner, failing to capture their joint effect. In this work, we propose ST-SAM, a novel and efficient Spatial-Temporal Self-Attention Model for traffic forecasting. ST-SAM uses a region embedding layer to learn time-specific embeddings for regions from traffic data. Then, it employs a spatial-temporal dependency learning module based on a self-attention mechanism to capture the joint spatial-temporal dependency for both nearby and faraway regions. ST-SAM relies entirely on self-attention to capture both local and global spatial-temporal correlations, which makes it effective and efficient. Extensive experiments on two real-world datasets show that ST-SAM is substantially more accurate and efficient than the state-of-the-art approaches (with an average improvement of up to 15% on RMSE, 17% on MAPE, and 32 times on training time in our experiments).
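To make the self-attention step concrete, here is a minimal sketch of single-head scaled dot-product self-attention over a handful of region embeddings. It is not ST-SAM itself: the abstract does not give the architecture details, so the identity projections (queries, keys, and values all equal to the embeddings), the function names, and the toy embeddings are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings.

    Each output row is a weighted average of all rows, so every region can
    draw on both nearby and faraway regions in a single step.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in embeddings
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * value[d] for w, value in zip(weights, embeddings))
            for d in range(dim)
        ]
        outputs.append(mixed)
    return outputs


# Three hypothetical regions, each with a 2-dimensional embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(regions)
```

A full model would add learned query, key, and value projections and usually several heads; the sketch keeps only the attention-weighted averaging that lets every region attend to every other region.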




