S2TX: Cross-Attention Multi-Scale State-Space Transformer for Time Series Forecasting
Zihao Wu, Juncheng Dong, Haoming Yang, Vahid Tarokh
–arXiv.org Artificial Intelligence
Multi-scale models have recently driven significant progress in time series forecasting by addressing the heterogeneity between long- and short-range patterns. Despite their state-of-the-art performance, we identify two potential areas for improvement. First, the variates of the multivariate time series are processed independently. Second, the multi-scale (long- and short-range) representations are learned separately by two independent models without communication. In light of these concerns, we propose the State Space Transformer with cross-attention (S2TX). S2TX employs a cross-attention mechanism to integrate a Mamba model, which extracts long-range cross-variate context, with a Transformer model that uses local window attention to capture short-range representations. By cross-attending to the global context, the Transformer branch further facilitates variate-level interactions as well as local/global communication. Comprehensive experiments on seven standard long- and short-range time series forecasting benchmark datasets demonstrate that S2TX achieves consistently strong state-of-the-art results while maintaining a low memory footprint.
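The sketch below is a hypothetical illustration (not the authors' code) of the fusion pattern the abstract describes: a short-range branch with local window self-attention cross-attending to a long-range, global context. A GRU stands in for the Mamba state-space encoder, and all module names, dimensions, and the window size are assumptions chosen for the example.

```python
# Hypothetical sketch of local-window attention fused with a global context via
# cross-attention. The GRU below is only a stand-in for the Mamba long-range model.
import torch
import torch.nn as nn


class LocalGlobalCrossBlock(nn.Module):
    """Toy block: windowed self-attention over short-range tokens, then
    cross-attention from those local tokens to a global (long-range) context."""

    def __init__(self, d_model: int = 64, n_heads: int = 4, window: int = 16):
        super().__init__()
        self.window = window
        # Stand-in for the long-range encoder; any module that mixes information
        # across the full history would fill this role in the sketch.
        self.global_encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token embeddings
        b, t, d = x.shape

        # Long-range branch: encode the whole sequence into a global context.
        global_ctx, _ = self.global_encoder(x)           # (b, t, d)

        # Short-range branch: self-attention restricted to non-overlapping windows.
        w = self.window
        assert t % w == 0, "sequence length must be divisible by the window size"
        local = x.reshape(b * t // w, w, d)               # (b * n_windows, w, d)
        local, _ = self.local_attn(local, local, local)
        local = self.norm1(local.reshape(b, t, d) + x)

        # Cross-attention: local tokens query the global context, letting the
        # short-range branch read long-range information.
        fused, _ = self.cross_attn(local, global_ctx, global_ctx)
        return self.norm2(fused + local)


if __name__ == "__main__":
    block = LocalGlobalCrossBlock()
    tokens = torch.randn(2, 128, 64)                      # (batch, seq_len, d_model)
    print(block(tokens).shape)                            # torch.Size([2, 128, 64])
```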
Feb-16-2025
- Country:
- Asia (0.28)
- North America (0.46)
- Genre:
- Research Report (0.64)
- Technology:
    - Information Technology
        - Artificial Intelligence
            - Machine Learning > Neural Networks
                - Deep Learning (0.93)
            - Natural Language (1.00)
            - Representation & Reasoning (0.93)
            - Vision (0.68)
        - Data Science > Data Mining (0.93)