TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment

📅 2024-06-03
📈 Citations: 9
Influential: 2
🤖 AI Summary
Existing multivariate time series (MTS) forecasting methods suffer from excessive parameter counts and heavy data requirements, while LLM-based approaches struggle to learn disentangled and robust temporal representations. To address these limitations, we propose TimeCMA, a cross-modal alignment framework featuring dual-branch encoders: a dedicated time-series encoder and an LLM-based textual prompt encoder. TimeCMA employs modality-similarity-driven embedding alignment to achieve disentangled time–semantic representation learning. Furthermore, we introduce the novel “tail-token focusing” strategy, which leverages only the final token’s embedding from the LLM’s output for prediction—significantly improving inference efficiency. Evaluated on eight real-world datasets, TimeCMA consistently outperforms state-of-the-art methods across all metrics, achieving both superior accuracy and low latency.

📝 Abstract
Multivariate time series forecasting (MTSF) aims to learn temporal dynamics among variables to forecast future time series. Existing statistical and deep learning-based methods suffer from limited learnable parameters and small-scale training data. Recently, large language models (LLMs) combining time series with textual prompts have achieved promising performance in MTSF. However, we discovered that current LLM-based solutions fall short in learning disentangled embeddings. We introduce TimeCMA, an intuitive yet effective framework for MTSF via cross-modality alignment. Specifically, we present a dual-modality encoding with two branches: the time series encoding branch extracts disentangled yet weak time series embeddings, and the LLM-empowered encoding branch wraps the same time series with text as prompts to obtain entangled yet robust prompt embeddings. As a result, such a cross-modality alignment retrieves both disentangled and robust time series embeddings, "the best of both worlds", from the prompt embeddings based on time series and prompt modality similarities. As another key design, to reduce the computational costs from time series with their lengthy textual prompts, we design an effective prompt to encourage the most essential temporal information to be encapsulated in the last token: only the last token is passed to downstream prediction. We further store the last token embeddings to accelerate inference speed. Extensive experiments on eight real datasets demonstrate that TimeCMA outperforms state-of-the-art methods.
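The two key mechanisms in the abstract, similarity-driven retrieval of robust embeddings from the prompt branch and passing only the last token downstream, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function name, shapes, and the use of a plain scaled dot-product softmax as the alignment step are all assumptions for exposition.

```python
import numpy as np

def cross_modality_retrieval(ts_emb, prompt_emb):
    """Retrieve a robust embedding for each time-series variable from the
    prompt-token embeddings via scaled dot-product similarity (an
    illustrative stand-in for the paper's cross-modality alignment)."""
    d = ts_emb.shape[-1]
    scores = ts_emb @ prompt_emb.T / np.sqrt(d)       # (N, L) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ prompt_emb                       # (N, d) aligned embeddings

rng = np.random.default_rng(0)
N, L, d = 7, 32, 16                         # variables, prompt tokens, embed dim
ts_emb = rng.standard_normal((N, d))        # disentangled yet weak branch
prompt_emb = rng.standard_normal((L, d))    # entangled yet robust LLM output

aligned = cross_modality_retrieval(ts_emb, prompt_emb)
last_token = prompt_emb[-1]   # "tail-token focusing": only the final token's
                              # embedding is stored and fed to the predictor
print(aligned.shape, last_token.shape)      # (7, 16) (16,)
```

Caching `last_token` per prompt is what lets inference skip the LLM forward pass over the full prompt, which is the efficiency gain the abstract attributes to last-token storage.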
Problem

Research questions and friction points this paper is trying to address.

Improving multivariate time series forecasting accuracy
Enhancing disentangled embeddings in LLM-based solutions
Reducing computational costs with efficient prompt design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-modality encoding for cross-modality alignment
Effective prompt design for essential temporal information
Last token storage for accelerated inference
Chenxi Liu
S-Lab, Nanyang Technological University, Singapore
Qianxiong Xu
S-Lab, Nanyang Technological University, Singapore
Hao Miao
The Hong Kong Polytechnic University
Spatio-Temporal Data Mining, Trajectory Management, Spatial Crowdsourcing
Sun Yang
Peking University, China
Lingzheng Zhang
Hong Kong University of Science and Technology (Guangzhou), China
Cheng Long
Nanyang Technological University
databases, machine learning, data mining
Ziyue Li
CS PhD, University of Maryland
Machine learning
Rui Zhao
SenseTime Research, China