🤖 AI Summary
Existing Mamba-based models for multivariate time series analysis struggle to explicitly model cross-variable interactions, to disentangle temporal dynamics from variable dependencies, and to account for time-lag effects, which limits their performance. To address these limitations, this work proposes DeMa, a dual-path delay-aware Mamba architecture that decomposes the input sequence into temporal and variable pathways. The temporal path employs a Mamba-SSD module to capture long-term univariate dynamics, while the variable path introduces a Mamba-DALA module that uses delay-aware linear attention to model cross-variable dependencies with explicit consideration of time lags. While maintaining linear computational complexity, DeMa achieves state-of-the-art performance across five tasks (long- and short-term forecasting, imputation, anomaly detection, and classification) and significantly improves computational efficiency.
📝 Abstract
Accurate and efficient multivariate time series (MTS) analysis is increasingly critical for a wide range of intelligent applications. In this area, Transformers have emerged as the predominant architecture due to their strong ability to capture pairwise dependencies. However, Transformer-based models suffer from quadratic computational complexity and high memory overhead, limiting their scalability and practical deployment in long-term and large-scale MTS modeling. Recently, Mamba has emerged as a promising linear-time alternative with high expressiveness. Nevertheless, directly applying vanilla Mamba to MTS remains suboptimal due to three key limitations: (i) the lack of explicit cross-variate modeling, (ii) difficulty in disentangling intra-series temporal dynamics from inter-series interactions, and (iii) insufficient modeling of latent time-lag interaction effects. These issues constrain its effectiveness across diverse MTS tasks. To address these challenges, we propose DeMa, a dual-path delay-aware Mamba backbone. DeMa preserves Mamba's linear-complexity advantage while substantially improving its suitability for MTS settings. Specifically, DeMa introduces three key innovations: (i) it decomposes the MTS into intra-series temporal dynamics and inter-series interactions; (ii) it develops a temporal path with a Mamba-SSD module to capture long-range dynamics within each individual series, enabling series-independent, parallel computation; and (iii) it designs a variate path with a Mamba-DALA module that integrates delay-aware linear attention to model cross-variate dependencies. Extensive experiments on five representative tasks (long- and short-term forecasting, data imputation, anomaly detection, and series classification) demonstrate that DeMa achieves state-of-the-art performance while remaining computationally efficient.
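To make the linear-complexity claim for the variate path concrete, the following is a minimal sketch of generic kernelized linear attention over variate tokens. It is not the Mamba-DALA module: the delay-aware component is omitted because the abstract does not specify it, and the `elu(x) + 1` feature map, the function names, and the toy `tokens` matrix (three variates, each embedded as a 2-dimensional token) are assumptions chosen for illustration. The sketch only shows that reordering the matrix products gives the same output without forming the n × n score matrix.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def feature_map(m):
    """elu(x) + 1: a positive feature map (assumed here, not taken from the paper)."""
    return [[x + 1.0 if x > 0 else math.exp(x) for x in row] for row in m]

def quadratic_attention(q, k, v):
    """Kernelized attention the quadratic way: (phi(Q) phi(K)^T) V, row-normalized."""
    fq, fk = feature_map(q), feature_map(k)
    scores = matmul(fq, transpose(fk))        # n x n matrix of pairwise scores
    out = matmul(scores, v)
    return [[x / sum(s) for x in o] for o, s in zip(out, scores)]

def linear_attention(q, k, v):
    """Same result via associativity: phi(Q) (phi(K)^T V), no n x n matrix."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)             # d x d_v summary of keys and values
    k_sum = [sum(col) for col in zip(*fk)]    # length-d vector for normalization
    out = matmul(fq, kv)
    norms = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[x / z for x in o] for o, z in zip(out, norms)]

# Toy input: 3 variate tokens with 2 features each (hypothetical values).
tokens = [[0.1, -0.4], [0.7, 0.2], [-0.3, 0.5]]
quad = quadratic_attention(tokens, tokens, tokens)
lin = linear_attention(tokens, tokens, tokens)

# The two orderings agree up to floating-point rounding.
max_diff = max(abs(x - y) for rq, rl in zip(quad, lin) for x, y in zip(rq, rl))
```

With n tokens of dimension d, the quadratic ordering costs on the order of n²·d operations, while the reordered form costs on the order of n·d², which is linear in the number of tokens. Any delay-aware mechanism would have to be layered on top of this basic scheme.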