Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment

📅 2025-05-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) suffer from limited cross-modal alignment efficacy and struggle to preserve sequence-level structural consistency in time-series forecasting. Method: This paper proposes a structure-guided cross-modal alignment framework that, for the first time, establishes sequence-level structural consistency as the core alignment principle. It introduces a shared state transition graph to jointly model temporal dynamics and linguistic representations, thereby transcending conventional token- or layer-level alignment paradigms. The method integrates a hidden Markov model (HMM), structure-aware cross-attention, and a probability-weighted semantic alignment mechanism. Contribution/Results: Evaluated on multiple benchmarks, the framework achieves state-of-the-art performance, significantly improving LLMs’ forecasting accuracy, generalization capability, and robustness—particularly in zero-shot and few-shot settings.
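The summary above says a state transition matrix is learned from text via an HMM and then used to hot-start the MEMM. In the paper this would be done with full HMM training (e.g. Baum-Welch); as a simplified, hypothetical illustration of what a row-stochastic transition matrix looks like, the sketch below estimates one directly from observed state sequences by counting and row-normalizing:

```python
import numpy as np

def transition_matrix(state_seqs, n_states):
    """Count-based estimate of a row-stochastic state transition matrix.

    Hypothetical sketch: the paper learns this via HMM training on text,
    not from observed state labels as done here.
    """
    counts = np.zeros((n_states, n_states))
    for seq in state_seqs:
        # tally each consecutive state-to-state transition
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    row = counts.sum(axis=1, keepdims=True)
    # rows with no observed transitions fall back to a uniform distribution
    return np.divide(counts, row,
                     out=np.full_like(counts, 1.0 / n_states),
                     where=row > 0)
```

Each row of the result is a probability distribution over next states, which is the structure the MEMM inherits as its hot start.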

📝 Abstract
The emerging paradigm of leveraging pretrained large language models (LLMs) for time series forecasting has predominantly employed linguistic-temporal modality alignment strategies through token-level or layer-wise feature mapping. However, these approaches fundamentally neglect a critical insight: the core competency of LLMs resides not merely in processing localized token features but in their inherent capacity to model holistic sequence structures. This paper posits that effective cross-modal alignment necessitates structural consistency at the sequence level. We propose Structure-Guided Cross-Modal Alignment (SGCMA), a framework that fully exploits and aligns the state-transition graph structures shared by time-series and linguistic data as sequential modalities, thereby endowing time series with language-like properties and delivering stronger generalization after modality alignment. SGCMA consists of two key components, namely Structure Alignment and Semantic Alignment. In Structure Alignment, a state transition matrix is learned from text data through Hidden Markov Models (HMMs), and a shallow transformer-based Maximum Entropy Markov Model (MEMM) receives the hot-start transition matrix and annotates each temporal patch with a state probability distribution, ensuring that the temporal representation sequence inherits language-like sequential dynamics. In Semantic Alignment, cross-attention is applied between temporal patches and the top-k tokens within each state, and the final temporal embeddings are derived as the expectation of these state-conditioned embeddings, weighted by the state probabilities. Experiments on multiple benchmarks demonstrate that SGCMA achieves state-of-the-art performance, offering a novel approach to cross-modal alignment in time series forecasting.
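The Semantic Alignment step described above (cross-attention between each patch and the top-k tokens of each state, then an expectation weighted by state probabilities) can be sketched as follows. This is a minimal numpy illustration under assumed shapes, not the paper's implementation; names like `semantic_alignment` and the single-head dot-product attention are simplifying assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_alignment(patches, state_tokens, state_probs):
    """Probability-weighted semantic alignment (hypothetical sketch).

    patches:      (P, d)    temporal patch embeddings
    state_tokens: (S, k, d) top-k token embeddings for each of S states
    state_probs:  (P, S)    per-patch state probabilities from the MEMM
    returns:      (P, d)    expected cross-attended embedding per patch
    """
    P, d = patches.shape
    S, k, _ = state_tokens.shape
    out = np.zeros((P, d))
    for s in range(S):
        # cross-attention: patches attend to this state's top-k tokens
        scores = patches @ state_tokens[s].T / np.sqrt(d)   # (P, k)
        attn = softmax(scores, axis=-1)
        aligned = attn @ state_tokens[s]                    # (P, d)
        # weight by the probability that each patch is in state s
        out += state_probs[:, s:s + 1] * aligned
    return out
```

With one-hot state probabilities this reduces to plain cross-attention against a single state's tokens; soft probabilities give the expectation over states described in the abstract.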
Problem

Research questions and friction points this paper is trying to address.

Token- and layer-level feature mapping neglects LLMs' capacity to model holistic sequence structures
Existing cross-modal alignment lacks sequence-level structural consistency between time series and language
Limited generalization and robustness of LLM forecasters after modality alignment, especially in zero-/few-shot settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structure-Guided Cross-Modal Alignment framework
State transition matrix via HMMs and MEMM
Cross-attention for semantic alignment of modalities