FaST: Efficient and Effective Long-Horizon Forecasting for Large-Scale Spatial-Temporal Graphs via Mixture-of-Experts

📅 2026-01-08

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenges of high computational cost, excessive memory consumption, and limited prediction accuracy faced by existing methods when performing long-horizon forecasting on large-scale spatiotemporal graphs. To this end, we propose FaST, a novel framework that introduces an adaptive graph agent attention mechanism to substantially reduce computational complexity. Furthermore, FaST incorporates a parallel Mixture-of-Experts (MoE) module based on Gated Linear Units (GLUs) to enhance model scalability and efficiency. The proposed approach enables efficient forecasting on graphs with thousands of nodes over horizons as long as 672 time steps (one week). Extensive experiments on multiple real-world datasets demonstrate that FaST significantly outperforms state-of-the-art methods, achieving high prediction accuracy while markedly reducing both computational and memory overhead.

Technology Category

Application Category

📝 Abstract

Spatial-Temporal Graph (STG) forecasting on large-scale networks has garnered significant attention. However, existing models predominantly focus on short-horizon predictions and suffer from notorious computational costs and memory consumption when scaling to long-horizon predictions and large graphs. Targeting the above challenges, we present FaST, an effective and efficient framework based on heterogeneity-aware Mixture-of-Experts (MoEs) for long-horizon and large-scale STG forecasting, which unlocks one-week-ahead (672 steps at a 15-minute granularity) prediction with thousands of nodes. FaST is underpinned by two key innovations. First, an adaptive graph agent attention mechanism is proposed to alleviate the computational burden inherent in conventional graph convolution and self-attention modules when applied to large-scale graphs. Second, we propose a new parallel MoE module that replaces traditional feed-forward networks with Gated Linear Units (GLUs), enabling an efficient and scalable parallel structure. Extensive experiments on real-world datasets demonstrate that FaST not only delivers superior long-horizon predictive accuracy but also achieves remarkable computational efficiency compared to state-of-the-art baselines. Our source code is available at: https://github.com/yijizhao/FaST.

Problem

Research questions and friction points this paper is trying to address.

long-horizon forecasting

spatial-temporal graph

computational efficiency

large-scale networks

memory consumption

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts

Spatial-Temporal Graph

Long-Horizon Forecasting

Graph Attention

Gated Linear Units

🔎 Similar Papers

No similar papers found.

Authors to Follow