poly-mamba
A SSM is Polymerized from Multivariate Time Series
State space models (SSMs) [1] [15] are foundational architectures that, unlike Transformers [2], run in subquadratic time, and they show strong performance with approximately linear complexity on long-range dependency tasks. Previous studies [3] [4] [5] have attempted to employ SSMs for Multivariate Time Series Forecasting (MTSF), but they all follow the Transformer-based MTSF modeling paradigm: learning dependencies between temporal tokens [6] [7] [8] [9], channel tokens [10], or their concatenation [11]. However, the distinctive complex dependency pattern of multivariate time series (MTS) is the Channel Dependency variations with Time (CDT), which none of these methods explicitly depicts. Modeling the CDT directly is impractical: computing the dependencies between the temporal tokens of all channels greatly increases complexity, and the resulting model is hard to generalize given the scale of most MTS data. We revisited the origin of SSMs [12], namely the real-time approximation of a continuously updated function by an orthogonal function basis [13], and found that, compared with Transformers, SSMs have the potential to depict the CDT pattern efficiently and effectively.