🤖 AI Summary
Time-series pretraining faces core challenges in modeling dynamic temporal dependencies, mitigating distribution shifts, and eliminating spurious correlations, all of which degrade generalization. To address these, we propose DeCoP (Dependency-Controlled Pretraining), a framework that combines instance-wise patch normalization with hierarchical dependency-controlled learning. DeCoP explicitly models dynamic cross-scale patch dependencies, decouples short- and long-term pattern interactions, and suppresses spurious correlations, integrating instance-level contrastive learning with multi-scale dependency modeling for robust representation learning. Evaluated on ten benchmark datasets, DeCoP consistently outperforms state-of-the-art methods: on ETTh1 it reduces MSE by 3% over PatchTST while requiring only 37% of PatchTST's FLOPs (a 63% reduction), improving both accuracy and efficiency.
📝 Abstract
Modeling dynamic temporal dependencies, which evolve due to distribution shifts and multi-scale patterns, is a critical challenge in time series pre-training. This temporal variability severely impairs the generalization of pre-trained models to downstream tasks. Existing frameworks fail to capture the complex interactions between short- and long-term dependencies, making them susceptible to spurious correlations that degrade generalization. To address these limitations, we propose DeCoP, a Dependency-Controlled Pre-training framework that explicitly models dynamic, multi-scale dependencies by simulating evolving inter-patch dependencies. At the input level, DeCoP introduces Instance-wise Patch Normalization (IPN) to mitigate distribution shifts while preserving the unique characteristics of each patch, creating a robust foundation for representation learning. At the latent level, a hierarchical Dependency-Controlled Learning (DCL) strategy explicitly models inter-patch dependencies across multiple temporal scales, while an Instance-level Contrastive Module (ICM) enhances global generalization by learning instance-discriminative representations from time-invariant positive pairs. DeCoP achieves state-of-the-art results on ten datasets at lower computational cost, reducing MSE by 3% on ETTh1 over PatchTST while using only 37% of the FLOPs.
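The abstract's input-level idea, Instance-wise Patch Normalization, can be illustrated with a minimal sketch: each patch of each series instance is standardized by its own statistics, so distribution shifts across patches are damped while per-patch shape is retained. This is an assumption-laden illustration of the general idea, not the authors' implementation; the function name, patching scheme, and the choice of plain mean/std standardization are all hypothetical.

```python
import numpy as np

def instance_wise_patch_norm(x: np.ndarray, patch_len: int, eps: float = 1e-5) -> np.ndarray:
    """Hypothetical sketch of Instance-wise Patch Normalization (IPN).

    x: array of shape (batch, seq_len), with seq_len divisible by patch_len.
    Each patch of each instance is standardized by its *own* mean and std,
    unlike series-level normalization, which would share one statistic per
    instance and thus blur patch-local distribution shifts.
    Returns an array of shape (batch, num_patches, patch_len).
    """
    b, t = x.shape
    assert t % patch_len == 0, "seq_len must be divisible by patch_len"
    # Split each series into non-overlapping patches.
    patches = x.reshape(b, t // patch_len, patch_len)
    # Per-instance, per-patch statistics (computed over the patch axis only).
    mu = patches.mean(axis=-1, keepdims=True)
    sigma = patches.std(axis=-1, keepdims=True)
    return (patches - mu) / (sigma + eps)
```

After this step, every patch has approximately zero mean and unit variance regardless of where it falls in the series, which is what makes the downstream dependency-modeling stages see comparable inputs across shifting distributions.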