Representing Time Series as Structured Programs for LLM Reasoning

📅 2026-06-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) process time series poorly because the data lie outside their native textual modality, and existing approaches lose accuracy and incur high computational cost on long sequences. This work proposes T2SP, a training-free, deterministic method that represents a time series as a structured symbolic program explicitly encoding trends, periodicity, and salient events, turning it into code, a modality on which LLMs are natively trained. By moving temporal-structure extraction from the model into the representation, T2SP mitigates the modality mismatch. In experiments on editing, captioning, and question-answering tasks, T2SP improves performance, reduces reasoning time, and lowers failure rates compared with raw-string representations.

📝 Abstract

Large language models (LLMs) have demonstrated strong reasoning and instruction-following capabilities, making them potentially powerful tools for time-series analysis. However, time series lie outside their native textual modality, raising a fundamental question: how should time series be represented so that LLMs can reason about them effectively? Existing work typically serializes raw numerical sequences or fine-tunes pre-trained LLMs on time-series data. These approaches place the burden of extracting temporal structure directly on the LLM, creating a modality mismatch that often degrades performance on long sequences and introduces substantial computational overhead. In this work, we introduce Time-Series-to-Structured-Program representation (T2SP), a deterministic, training-free method that represents a time series as a structured symbolic program. T2SP decomposes time series into trends, periods, and salient events, expressing them in a program-friendly format aligned with the textual and code-like modalities on which LLMs are natively trained. By shifting temporal-structure extraction from the model to the representation itself, T2SP enables off-the-shelf LLMs to leverage their existing reasoning capabilities for time-series understanding. We evaluate T2SP on three reasoning tasks (editing, captioning, and question answering), where it consistently improves performance, reduces reasoning time, and lowers failure rates compared with raw-string representations. Our results demonstrate that T2SP provides an effective interface between time series and LLMs.
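To make the idea of a program-like representation concrete, the following is a minimal sketch of one way a series could be split into a trend, a period, and salient events and then written out as code-like text. It is not the T2SP algorithm or its output format, which the abstract does not specify. The least-squares trend, the autocorrelation-based period search, the z-score event rule, and every name here (`fit_linear_trend`, `dominant_period`, `decompose`, `render_program`, and the emitted `linear_trend`, `seasonal`, `spike` calls) are assumptions chosen for illustration.

```python
import math
from statistics import mean, pstdev


def fit_linear_trend(values):
    """Ordinary least-squares line over time indices 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values)) / denom
    return slope, v_mean - slope * t_mean


def dominant_period(residual, min_score=0.3):
    """Lag with the highest autocorrelation, or None if no lag beats min_score."""
    n = len(residual)
    mu = mean(residual)
    var = sum((r - mu) ** 2 for r in residual)
    if var == 0:
        return None
    best_lag, best_score = None, min_score
    for lag in range(2, n // 2 + 1):
        cov = sum((residual[i] - mu) * (residual[i + lag] - mu) for i in range(n - lag))
        if cov / var > best_score:
            best_lag, best_score = lag, cov / var
    return best_lag


def decompose(values, event_z=3.0):
    """Split a series into trend, seasonal profile, and outlier events (illustrative)."""
    slope, intercept = fit_linear_trend(values)
    residual = [v - (slope * t + intercept) for t, v in enumerate(values)]
    period = dominant_period(residual)
    profile = []
    if period is not None:
        # Average the detrended values at each phase of the period.
        profile = [mean(residual[p::period]) for p in range(period)]
        residual = [r - profile[t % period] for t, r in enumerate(residual)]
    sigma = pstdev(residual)
    # Hypothetical event rule: remaining points more than event_z sigmas from zero.
    events = [(t, r) for t, r in enumerate(residual) if sigma > 0 and abs(r) > event_z * sigma]
    return {"slope": slope, "intercept": intercept,
            "period": period, "profile": profile, "events": events}


def render_program(parts):
    """Emit the components as code-like text; the call names are made up for this sketch."""
    lines = [f"series = linear_trend(slope={parts['slope']:.3f}, "
             f"intercept={parts['intercept']:.3f})"]
    if parts["period"] is not None:
        profile = ", ".join(f"{x:.2f}" for x in parts["profile"])
        lines.append(f"series += seasonal(period={parts['period']}, profile=[{profile}])")
    for t, delta in parts["events"]:
        lines.append(f"series += spike(at={t}, delta={delta:+.2f})")
    return "\n".join(lines)


# Synthetic example: upward trend, period-8 oscillation, one injected spike.
values = [0.5 * t + 10 * math.sin(2 * math.pi * t / 8) for t in range(64)]
values[40] += 50
print(render_program(decompose(values)))
```

On this synthetic input the rendered text is three short lines, where a raw-string serialization would list all 64 numbers. The length of the rendered text depends on the number of components and events found, not directly on the number of samples, which is consistent with the motivation the abstract gives for long sequences.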

Problem

Research questions and friction points this paper is trying to address.

time series representation

large language models

modality mismatch

temporal structure

structured programs

Innovation

Methods, ideas, or system contributions that make the work stand out.

structured program representation

time series reasoning

large language models