HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting

📅 2025-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the poor performance and limited prediction horizon of generative models in long-horizon forecasting of high-dimensional multivariate time series. We propose a Hierarchical Discrete Transformer (HDT) framework featuring a novel hierarchical discrete modeling mechanism: at the bottom level, ℓ²-normalized vector quantization (VQ) maps raw sequences into discrete trend tokens to explicitly capture long-range temporal dynamics; at the top level, a conditional autoregressive transformer generates fine-grained target tokens conditioned on these trend tokens. By integrating ℓ²-normalized token embeddings with a hierarchical Transformer architecture, HDT ensures training stability while enhancing representation efficiency for high-dimensional sequences. Extensive experiments on five benchmark multivariate datasets demonstrate that HDT significantly outperforms existing generative forecasting methods—particularly excelling in high-dimensional, long-horizon settings—with higher accuracy, faster inference, and superior scalability.
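The bottom-level tokenization step described above can be sketched roughly as follows. This is a minimal illustration of ℓ²-normalized vector quantization, not the paper's implementation: the patch shapes, codebook size, and function names are assumptions for the example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale each vector to unit l2 norm (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def vq_tokenize(patches, codebook):
    """Map each patch embedding to its nearest l2-normalized codebook
    vector; with unit-norm vectors, nearest-in-Euclidean-distance is
    equivalent to highest cosine similarity (inner product)."""
    z = l2_normalize(patches)      # (N, D) unit-norm patch embeddings
    c = l2_normalize(codebook)     # (K, D) unit-norm code vectors
    sims = z @ c.T                 # (N, K) cosine similarities
    tokens = sims.argmax(axis=1)   # (N,) discrete token ids
    quantized = c[tokens]          # (N, D) quantized representations
    return tokens, quantized

# toy data: 16 patch embeddings of dimension 8, a codebook of 32 codes
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))
codebook = rng.normal(size=(32, 8))
tokens, quantized = vq_tokenize(patches, codebook)
```

Normalizing both the inputs and the codebook to the unit sphere bounds the magnitude of quantization targets, which is the kind of property the summary credits for training stability.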

Technology Category

Application Category

📝 Abstract
Generative models have gained significant attention in multivariate time series (MTS) forecasting, particularly for their ability to generate high-fidelity samples. Forecasting the probability distribution of a multivariate time series is a challenging yet practical task. Although recent attempts have been made to handle it, two major challenges persist: 1) some existing generative methods underperform in high-dimensional multivariate time series forecasting and are hard to scale to higher dimensions; 2) the inherently high-dimensional multivariate attributes constrain the forecasting lengths of existing generative models. In this paper, we point out that discrete token representations can model high-dimensional MTS with faster inference, and that forecasting the target from its own long-term trend can extend the forecasting length with high accuracy. Motivated by this, we propose a vector-quantized framework called Hierarchical Discrete Transformer (HDT) that models time series as discrete token representations with an l2-normalization-enhanced vector quantization strategy, transforming MTS forecasting into discrete token generation. To address the limitations of generative models in long-term forecasting, we propose a hierarchical discrete Transformer: it captures the discrete long-term trend of the target at the low level and leverages this trend as a condition to generate the discrete representation of the target at the high level, introducing the target's own features to extend the forecasting length in high-dimensional MTS. Extensive experiments on five popular MTS datasets verify the effectiveness of the proposed method.
Problem

Research questions and friction points this paper is trying to address.

Improving forecasting of high-dimensional multivariate time series.
Extending the forecasting horizon via discrete token representations.
Enhancing accuracy of long-term trend prediction.
Innovation

Methods, ideas, or system contributions that make the work stand out.

HDT models time series as discrete token representations.
Vector quantization with l2 normalization stabilizes training.
A hierarchical Transformer conditions target-token generation on trend tokens, extending the forecasting horizon.
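The hierarchical conditioning idea can be illustrated with a toy sketch: the low-level trend tokens are prepended as a prefix, and the top level generates target tokens autoregressively. The `step_fn` interface and the greedy decoding loop below are illustrative assumptions standing in for the paper's Transformer decoder.

```python
import numpy as np

def generate_target_tokens(trend_tokens, step_fn, horizon):
    """Autoregressively generate `horizon` target tokens, conditioned
    on discrete trend tokens prepended as a prefix.
    `step_fn(context) -> logits` is a hypothetical stand-in for the
    top-level Transformer decoder."""
    context = list(trend_tokens)   # condition on the low-level trend
    generated = []
    for _ in range(horizon):
        logits = step_fn(np.array(context))
        nxt = int(np.argmax(logits))   # greedy decoding for the sketch
        generated.append(nxt)
        context.append(nxt)            # feed back for the next step
    return generated

def toy_step(context, vocab_size=32):
    """Toy decoder: puts all its weight on repeating the last token."""
    logits = np.zeros(vocab_size)
    logits[context[-1] % vocab_size] = 1.0
    return logits

out = generate_target_tokens([3, 7, 7], toy_step, horizon=4)
# greedy decoding under the toy decoder repeats the last trend token:
# [7, 7, 7, 7]
```

The point of the structure is that the target's own coarse trend enters the context window before generation starts, which is how the summary explains the extended forecasting horizon.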