🤖 AI Summary
This study addresses the computational inefficiency in time series language models arising from the unified treatment of time series and prompt tokens, which exhibit fundamentally different information structures. The work reveals an asymmetry in token importance: time series tokens contribute unevenly across the frequency spectrum, while the influence of prompt tokens diminishes with model depth. Leveraging this insight, the authors propose an adaptive token compression framework that dynamically compresses time series tokens via spectral analysis and progressively prunes prompt tokens in deeper layers, enabling hierarchical, frequency-aware, non-uniform budget allocation. Evaluated across forecasting, classification, imputation, and anomaly detection tasks, the method achieves up to 7.68× speedup and improves performance in 78% of experimental settings.
📝 Abstract
Large language models (LLMs) have enabled time series (TS) analysis by jointly modeling numerical observations and textual context through a shared token interface. However, TS tokens and prompt tokens exhibit fundamentally different information structures, making uniform token processing inefficient. In this paper, we study token efficiency in TS language modeling from an asymmetric-token perspective. We show that TS tokens have highly uneven spectral contributions, where many tokens share redundant frequency patterns while a small subset preserves critical temporal evidence. We also observe that prompt-token influence attenuates with model depth, suggesting that full prompt retention across all layers is unnecessary. Based on these findings, we develop an adaptive token budgeting framework that compresses TS tokens via frequency-domain structure and progressively reduces prompt tokens across layers. Experiments across forecasting, classification, imputation, and anomaly detection demonstrate up to 7.68× inference acceleration and performance gains in 78% of evaluated settings, showing the effectiveness of asymmetric token compression for scalable TS foundation models.
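The two ideas in the abstract (frequency-aware selection of TS tokens and a shrinking prompt budget across layers) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring rule or the schedule, so the function names `select_ts_tokens` and `prompt_budget_schedule`, the "distance from the mean spectrum" score, and the linear decay are all assumptions chosen only to make the concept concrete.

```python
import numpy as np


def select_ts_tokens(patches: np.ndarray, keep: int) -> np.ndarray:
    """Pick `keep` time series tokens (patches) with the least redundant spectra.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    a patch whose magnitude spectrum is close to the average spectrum is
    treated as redundant; a patch far from the average is treated as
    carrying distinctive temporal evidence.
    """
    # Magnitude spectrum of each patch, shape (n_tokens, n_freqs).
    spectra = np.abs(np.fft.rfft(patches, axis=1))
    mean_spectrum = spectra.mean(axis=0, keepdims=True)
    # Euclidean distance of each patch spectrum from the mean spectrum.
    scores = np.linalg.norm(spectra - mean_spectrum, axis=1)
    # Keep the highest-scoring patches, returned in temporal order.
    top = np.argsort(scores)[-keep:]
    return np.sort(top)


def prompt_budget_schedule(n_prompt: int, n_layers: int, final_ratio: float) -> list[int]:
    """Number of prompt tokens retained at each layer.

    Hypothetical linear decay (an assumption): the first layer keeps every
    prompt token and the last layer keeps `final_ratio` of them.
    """
    ratios = np.linspace(1.0, final_ratio, n_layers)
    return [max(1, int(round(n_prompt * r))) for r in ratios]
```

Under these assumptions, a series made of near-identical periodic patches plus one patch containing a spike would keep the spike patch first, and a 20-token prompt with `final_ratio=0.2` would be pruned gradually rather than truncated at a single layer. A real system would likely score prompt tokens by attention or gradient-based importance before dropping them; that part is left out here because the abstract does not describe it.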