WDformer: A Wavelet-based Differential Transformer Model for Time Series Forecasting

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the insufficient use of time-frequency information and the susceptibility of conventional attention mechanisms to historical noise in time series forecasting, this paper proposes WDformer, a wavelet-based differential Transformer. The model applies the wavelet transform for multi-resolution, joint time-frequency modeling; uses attention over inverted dimensions to strengthen inter-channel dependency learning; and introduces a differential attention mechanism that computes scores as the difference of two softmax attention matrices, suppressing responses to irrelevant history while amplifying the selection of critical time-frequency features. By combining wavelet analysis, inverted-dimension attention, and differential attention, WDformer achieves state-of-the-art performance on multiple multivariate time series benchmarks, improving both prediction accuracy and robustness. The source code is publicly available.

📝 Abstract
Time series forecasting has a wide range of applications, such as meteorological rainfall prediction, traffic flow analysis, financial forecasting, and operational load monitoring. Due to the sparsity of time series data, relying solely on time-domain or frequency-domain modeling limits a model's ability to fully leverage multi-domain information. Moreover, when applied to time series forecasting tasks, traditional attention mechanisms tend to over-focus on irrelevant historical information, which may introduce noise into the prediction process and bias the results. We propose WDformer, a wavelet-based differential Transformer model. This study employs the wavelet transform to conduct a multi-resolution analysis of time series data; by leveraging the advantages of a joint time-frequency representation, it accurately extracts the key components that reflect the essential characteristics of the data. Furthermore, we apply attention on inverted dimensions, allowing the attention mechanism to capture relationships between multiple variables. When performing attention calculations, we introduce the differential attention mechanism, which computes the attention score by taking the difference between two separate softmax attention matrices. This approach enables the model to focus more on important information and reduces noise. WDformer achieves state-of-the-art (SOTA) results on multiple challenging real-world datasets, demonstrating its accuracy and effectiveness. Code is available at https://github.com/xiaowangbc/WDformer.
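The multi-resolution analysis described in the abstract can be illustrated with a one-level discrete wavelet transform. The sketch below uses the Haar wavelet in plain NumPy; the filter choice and function names are illustrative assumptions, not the paper's implementation (WDformer's actual wavelet basis and decomposition depth are specified in the paper/code).

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: split a series (even length) into a
    low-frequency approximation (trend) and high-frequency detail."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # coarse, time-frequency trend
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # fine, local fluctuations
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse transform: the decomposition is lossless, so the original
    series is recovered exactly from the two sub-bands."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x
```

Applying `haar_dwt` recursively to the approximation coefficients yields the multi-resolution pyramid on which joint time-frequency modeling operates: each level isolates structure at a different scale while the transform as a whole preserves all information.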
Problem

Research questions and friction points this paper is trying to address.

Addresses time series forecasting limitations in multi-domain information utilization
Reduces noise from irrelevant historical information in attention mechanisms
Improves accuracy by capturing time-frequency characteristics and variable relationships
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wavelet transform enables multi-resolution time series analysis
Differential attention mechanism reduces noise in predictions
Inverted dimension attention captures multi-variable relationships
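The differential attention idea above (scores formed as the difference of two softmax attention matrices) can be sketched as follows. This is a minimal single-head NumPy illustration under assumed shapes; the projection names, the fixed scalar `lam`, and the absence of multi-head splitting are simplifications — in the actual model the subtraction weight is typically learnable and attention runs over the inverted (variable) dimension.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable row-wise softmax.
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Attention scores = softmax map 1 - lam * softmax map 2.
    Common-mode responses to irrelevant history appear in both maps
    and largely cancel, while salient positions are kept."""
    d = Wq1.shape[1]
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))
    weights = a1 - lam * a2          # differential score matrix
    return weights @ (x @ Wv), weights
```

Because each softmax map has rows summing to 1, every row of the differential score matrix sums to `1 - lam`: the subtraction removes a uniform "noise floor" of attention mass rather than renormalizing it, which is what lets the mechanism suppress irrelevant historical responses.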
Xiaojian Wang
Assistant Professor, University of Colorado Denver
Satellite Network, Edge Computing, Security, Payment Channel Network, Blockchain
Chaoli Zhang
Zhejiang Normal University
Time Series Analysis, Machine Learning, Algorithmic Game Theory
Zhonglong Zheng
School of Computer Science and Technology, Zhejiang Key Laboratory of Intelligent Education Technology and Application, Zhejiang Normal University
Yunliang Jiang
Zhejiang Key Laboratory of Intelligent Education Technology and Application, School of Computer Science and Technology, Zhejiang Normal University