🤖 AI Summary
Transformer-based models for time-series forecasting suffer from high computational complexity and overfitting, while standard MLPs struggle to capture complex, multi-scale temporal patterns. Method: This paper proposes the Adaptive Multi-Scale Decomposition (AMD) framework, a pure MLP-based approach. Its core component is the Multi-Scale Decomposable Mixing (MDM) block, which dissects a series into distinct temporal patterns at multiple scales and aggregates them. It is complemented by the Dual Dependency Interaction (DDI) block, which models both temporal and channel dependencies, and the Adaptive Multi-predictor Synthesis (AMS) block, which refines multi-scale integration. By explicitly decomposing and jointly modeling multi-scale dynamics and their cross-scale dependencies, the method substantially improves pattern representation. Contribution/Results: Evaluated on multiple benchmark datasets, the approach achieves state-of-the-art accuracy while keeping computational overhead well below that of leading Transformer-based models.
📝 Abstract
Transformer-based and MLP-based methods have emerged as leading approaches in time series forecasting (TSF). However, real-world time series often exhibit different patterns at different scales, and future changes are shaped by the interplay of these overlapping scales, requiring high-capacity models. While Transformer-based methods excel at capturing long-range dependencies, they suffer from high computational complexity and tend to overfit. Conversely, MLP-based methods offer computational efficiency and adeptness in modeling temporal dynamics, but they struggle to effectively capture temporal patterns across complex, multiple scales. Based on the observation of the multi-scale entanglement effect in time series, we propose a novel MLP-based Adaptive Multi-Scale Decomposition (AMD) framework for TSF. Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block to dissect and aggregate these patterns. Complemented by the Dual Dependency Interaction (DDI) block and the Adaptive Multi-predictor Synthesis (AMS) block, our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration. Comprehensive experiments demonstrate that our AMD framework not only overcomes the limitations of existing methods but also consistently achieves state-of-the-art performance across various datasets.
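To make the decomposition idea concrete, the following is a minimal, illustrative sketch (not the paper's actual MDM implementation) of how a series can be split into progressively coarser temporal patterns with moving averages. The kernel sizes and the residual/trend split are assumptions for illustration; the paper's block additionally mixes and aggregates the scales with learned MLPs.

```python
import numpy as np

def moving_avg(x, kernel):
    """Moving average along the time axis with edge padding (a simple trend extractor)."""
    pad = kernel // 2
    xp = np.pad(x, ((pad, kernel - 1 - pad), (0, 0)), mode="edge")
    kern = np.ones(kernel) / kernel
    return np.apply_along_axis(lambda s: np.convolve(s, kern, mode="valid"), 0, xp)

def multi_scale_decompose(x, kernels=(4, 8, 16)):
    """Split a series of shape [time, channels] into one residual pattern per
    scale plus a final smooth trend; coarser kernels peel off smoother components.
    The components telescope: they sum back to the original series."""
    components = []
    trend = x
    for k in kernels:
        smooth = moving_avg(trend, k)
        components.append(trend - smooth)  # pattern at this scale
        trend = smooth
    components.append(trend)               # remaining coarse trend
    return components

rng = np.random.default_rng(0)
series = rng.normal(size=(96, 3))          # 96 time steps, 3 channels
parts = multi_scale_decompose(series)
print(len(parts), np.allclose(sum(parts), series))  # 4 components, exact reconstruction
```

In a full forecasting model, each extracted component would be fed to its own lightweight MLP predictor and the per-scale forecasts recombined, which is the role the DDI and AMS blocks play in the proposed framework.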