🤖 AI Summary
Time-series modeling faces challenges including variable-length sequence handling, high feature redundancy, and limited generalization. To address these, we propose ConvFormer, a convolution-like multi-scale fusion framework that jointly employs temporal patching and multi-head attention to progressively compress the time dimension while expanding channel capacity. It further introduces cross-scale attention and logarithmic-space normalization to strengthen multi-scale feature interaction and suppress redundant representations. The resulting hierarchical time-series representation achieves significant improvements over state-of-the-art Transformer- and CNN-based baselines on both forecasting and classification tasks, reducing feature redundancy by 12.6%–28.4% and improving average performance by 3.7%–9.2%. Our core contribution is the first unified integration of convolution-like structural inductive bias, cross-scale attention, and logarithmic normalization within a Transformer architecture, effectively balancing local pattern modeling with global dependency capture.
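The summary's "compress the time dimension while expanding channel capacity" step can be illustrated with a minimal patch-merging sketch. This is not the paper's implementation; the function name `patch_merge` and the patch size are illustrative, and the attention stage between merges is omitted:

```python
# Hypothetical sketch of conv-like temporal patching: a length-T sequence
# of C-dimensional features is grouped into non-overlapping patches of
# size P, shrinking the time axis by P while growing channels by P.
def patch_merge(x, patch_size):
    """x: list of T feature vectors (each a list of C floats).
    Returns T // patch_size vectors of dimension C * patch_size."""
    assert len(x) % patch_size == 0, "sequence length must divide evenly"
    merged = []
    for i in range(0, len(x), patch_size):
        # Concatenate the features of one patch along the channel axis.
        vec = []
        for t in range(patch_size):
            vec.extend(x[i + t])
        merged.append(vec)
    return merged

# Example: T=8 timesteps, C=2 channels, patch size 2.
seq = [[float(t), float(t) + 0.5] for t in range(8)]
out = patch_merge(seq, 2)
print(len(out), len(out[0]))  # time halved to 4, channels doubled to 4
```

Stacking such a merge after each attention block yields the pyramidal, CNN-like hierarchy the summary describes: deeper levels see coarser time resolution but richer channels.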
📝 Abstract
Time series analysis faces significant challenges in handling variable-length data and achieving robust generalization. While Transformer-based models have advanced time series tasks, they often struggle with feature redundancy and limited generalization capabilities. Drawing inspiration from classical CNN architectures' pyramidal structure, we propose a Multi-Scale Representation Learning Framework based on a Conv-like ScaleFusion Transformer. Our approach introduces a temporal convolution-like structure that combines patching operations with multi-head attention, enabling progressive temporal dimension compression and feature channel expansion. We further develop a novel cross-scale attention mechanism for effective feature fusion across different temporal scales, along with a log-space normalization method for variable-length sequences. Extensive experiments demonstrate that our framework achieves superior feature independence, reduced redundancy, and better performance in forecasting and classification tasks compared to state-of-the-art methods.
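The abstract does not give the exact formula for its log-space normalization, so the sketch below shows one plausible interpretation: standardizing features in the log domain so sequences of different lengths and scales become comparable. The function name and the epsilon guard are assumptions, not the paper's method:

```python
import math

def log_space_normalize(x, eps=1e-8):
    """Illustrative log-space normalization (formula assumed, not from
    the paper): map positive values into log space, then standardize
    to zero mean and unit variance regardless of sequence length."""
    logs = [math.log(v + eps) for v in x]
    mean = sum(logs) / len(logs)
    var = sum((l - mean) ** 2 for l in logs) / len(logs)
    std = math.sqrt(var) or 1.0  # guard against constant sequences
    return [(l - mean) / std for l in logs]

# Two sequences of different lengths map to the same normalized scale.
short = log_space_normalize([1.0, 10.0, 100.0])
long_ = log_space_normalize([1.0, 10.0, 100.0, 1000.0, 10000.0])
```

The appeal of working in log space here is that multiplicative scale differences between series become additive shifts, which standardization then removes; this is one way variable-length, variable-scale inputs could be made directly comparable.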