🤖 AI Summary
To address semantic distortion and insufficient robustness in multivariate time series forecasting (MTSF) when the missing rate changes over time, this paper proposes Merlin, a multi-view representation learning framework. Merlin uses offline knowledge distillation to align the semantic representations of incomplete observations with those of complete observations, introducing the first semantic alignment mechanism explicitly designed for dynamic missing rates. It is also the first to combine multi-view contrastive learning with knowledge distillation, which improves generalization to unseen missingness patterns. The framework further incorporates missingness-aware data augmentation and temporal embedding modeling to jointly improve prediction accuracy and robustness. Evaluated on four real-world datasets, Merlin maintains stable performance under high and varying missing rates, consistently outperforming state-of-the-art MTSF methods across all benchmarks.
📝 Abstract
Multivariate Time Series Forecasting (MTSF) involves predicting future values of multiple interrelated time series. Recently, deep learning-based MTSF models have gained significant attention for their promising ability to mine semantics (global and local information) within MTS data. However, these models are highly susceptible to missing values caused by malfunctioning data collectors. Missing values not only disrupt the semantics of MTS; their distribution also changes over time. Existing models lack robustness to both problems, which leads to suboptimal forecasting performance. To this end, we propose Multi-View Representation Learning (Merlin), which helps existing models achieve semantic alignment between complete observations and incomplete observations with different missing rates in MTS. Specifically, Merlin consists of two key modules: offline knowledge distillation and multi-view contrastive learning. The former uses a teacher model to guide a student model in mining, from incomplete observations, semantics similar to those obtainable from complete observations. The latter improves the student model's robustness by learning from positive/negative data pairs constructed from incomplete observations with different missing rates, ensuring semantic alignment across missing rates. As a result, Merlin effectively enhances the robustness of existing models against varying missing rates while preserving forecasting accuracy. Experiments on four real-world datasets demonstrate the superiority of Merlin.
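The two modules named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that missing entries are zero-filled, that the distillation term is a mean squared error between student and teacher representations, that the contrastive term is an InfoNCE-style loss over cosine similarities, and that a "positive" pair is the same window viewed at two missing rates. All function names (`mask_series`, `distillation_loss`, `info_nce`) and the flattening "encoder" are hypothetical stand-ins.

```python
import math
import random


def mask_series(window, missing_rate, rng):
    """Copy `window` (time steps x variables), zeroing each entry
    independently with probability `missing_rate` (assumed zero-fill)."""
    return [[0.0 if rng.random() < missing_rate else v for v in row]
            for row in window]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def distillation_loss(student_repr, teacher_repr):
    """Mean squared error pulling the student's representation of an
    incomplete window toward the teacher's representation of the complete one."""
    return sum((s - t) ** 2 for s, t in zip(student_repr, teacher_repr)) / len(student_repr)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: small when the anchor is closer to the positive
    than to every negative."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def flatten(window):
    """Toy stand-in for an encoder: the representation is the flattened window."""
    return [v for row in window for v in row]


rng = random.Random(0)
window = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # complete observation
other = [[6.0, 1.0], [0.5, 9.0], [2.0, 0.1]]    # a different sample (negative)

view_low = flatten(mask_series(window, 0.25, rng))    # same window, low missing rate
view_high = flatten(mask_series(window, 0.75, rng))   # same window, high missing rate
negative = flatten(mask_series(other, 0.5, rng))

# Combined objective for one sample: distillation term plus contrastive term.
total = distillation_loss(view_high, flatten(window)) + info_nce(view_low, view_high, [negative])
print(total)
```

In a real setup the flattening step would be replaced by the teacher and student forecasting models, and the two loss terms would typically be weighted against the forecasting loss. Those weights are not given in the abstract.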