🤖 AI Summary
Multivariate time series forecasting must capture both short-term fluctuations and long-term dependencies while balancing accuracy, efficiency, and robustness. This work proposes IPatch, a multi-resolution Transformer architecture that integrates point-wise and patch-wise embeddings to build a temporal representation at multiple granularities. By combining fine-grained detail preservation with stronger contextual modeling, the approach is designed to improve both predictive performance and computational efficiency. Experiments on seven benchmark datasets show that the method consistently outperforms single-representation baselines in forecasting accuracy, noise robustness, and generalization across prediction horizons.
📝 Abstract
Accurate forecasting of multivariate time series remains challenging due to the need to capture both short-term fluctuations and long-range temporal dependencies. Transformer-based models have emerged as a powerful approach, but their performance depends critically on the representation of temporal data. Traditional point-wise representations preserve individual time-step information, enabling fine-grained modeling, yet they tend to be computationally expensive and less effective at modeling broader contextual dependencies, which limits their scalability to long sequences. Patch-wise representations aggregate consecutive steps into compact tokens to improve efficiency and model local temporal dynamics, but they often discard fine-grained temporal details that are critical for accurate predictions in volatile or complex time series. We propose IPatch, a multi-resolution Transformer architecture that integrates both point-wise and patch-wise tokens, modeling temporal information at multiple resolutions. Experiments on seven benchmark datasets demonstrate that IPatch consistently improves forecasting accuracy, robustness to noise, and generalization across various prediction horizons compared with single-representation baselines.
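The difference between the two representations can be illustrated with a minimal sketch. It covers only how point-wise and patch-wise tokens might be formed from a single univariate series. It is not IPatch's architecture, which the abstract does not specify. The function names, the patch length, and the stride are hypothetical choices made for illustration.

```python
import numpy as np

def pointwise_tokens(series):
    """One token per time step: shape (T,) -> (T, 1)."""
    return series.reshape(-1, 1)

def patchwise_tokens(series, patch_len, stride):
    """Group consecutive steps into patches: shape (T,) -> (N, patch_len).

    patch_len and stride are illustrative parameters, not values from the paper.
    Trailing steps that do not fill a whole patch are dropped.
    """
    n_patches = (len(series) - patch_len) // stride + 1
    starts = np.arange(n_patches) * stride
    # Row i holds the indices of the i-th patch.
    idx = starts[:, None] + np.arange(patch_len)[None, :]
    return series[idx]

# Toy series of 16 steps (values 0..15) so the grouping is easy to read.
series = np.arange(16, dtype=float)

points = pointwise_tokens(series)                          # 16 tokens of width 1
patches = patchwise_tokens(series, patch_len=4, stride=4)  # 4 tokens of width 4

print(points.shape, patches.shape)
print(patches[1])
```

With these settings the same series yields 16 point-wise tokens but only 4 patch-wise tokens. Because standard self-attention compares every token with every other token, the shorter patch sequence is much cheaper to process, while each patch token no longer exposes its individual time steps as separate positions. This is the trade-off the abstract describes.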