HSTMixer: A Hierarchical MLP-Mixer for Large-Scale Traffic Forecasting

📅 2025-11-26
🤖 AI Summary
Existing models for large-scale urban traffic flow forecasting suffer from high computational complexity (O(N²)) and poor deployability on real-world road networks. To address this, we propose an efficient fully MLP-based architecture. Our method introduces a hierarchical spatiotemporal mixing block—featuring bottom-up aggregation and top-down propagation—to enable multi-scale spatiotemporal modeling, and incorporates a region-semantic-aware adaptive mixer that dynamically captures spatial heterogeneity via learnable transformation matrices. Crucially, we eliminate both attention mechanisms and graph convolutions, substantially reducing computational overhead. Evaluated on four large-scale real-world traffic datasets, our model achieves state-of-the-art prediction accuracy with significantly fewer parameters and lower FLOPs, demonstrating both superior accuracy and strong practical applicability.

📝 Abstract
Traffic forecasting is essential to modern urban management. Large-scale forecasting has recently drawn growing attention, as it better reflects the complexity of real-world traffic networks. However, existing models often exhibit quadratic computational complexity, making them impractical for large-scale real-world scenarios. In this paper, we propose a novel framework, Hierarchical Spatio-Temporal Mixer (HSTMixer), which leverages an all-MLP architecture for efficient and effective large-scale traffic forecasting. HSTMixer employs a hierarchical spatiotemporal mixing block to extract multi-resolution features through bottom-up aggregation and top-down propagation. Furthermore, an adaptive region mixer generates transformation matrices based on regional semantics, enabling the model to dynamically capture evolving spatiotemporal patterns for different regions. Extensive experiments conducted on four large-scale real-world datasets demonstrate that the proposed method not only achieves state-of-the-art performance but also exhibits competitive computational efficiency.
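The bottom-up aggregation and top-down propagation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the hard node-to-region assignment matrix `assign`, the mean pooling, the residual connections, and all names (`hierarchical_mix`, `w_region`, `w1`, `w2`) are assumptions made only for illustration. The point it shows is that nodes interact through a small number of regions, so no N × N matrix is ever formed.

```python
import numpy as np

def mlp(x, w1, w2):
    """Two-layer MLP along the last axis with a ReLU in between."""
    return np.maximum(x @ w1, 0.0) @ w2

def hierarchical_mix(x, assign, params):
    """Hypothetical hierarchical mixing step.

    x:      (N, D) node features
    assign: (N, R) one-hot node-to-region assignment (assumed given)
    params: dict with 'w_region' (R, R), 'w1' (D, H), 'w2' (H, D)
    """
    # Bottom-up aggregation: mean of node features inside each region -> (R, D)
    region_sizes = assign.sum(axis=0)[:, None]
    regions = (assign.T @ x) / np.maximum(region_sizes, 1.0)

    # Coarse-level mixing: across regions (R x R), then across channels (MLP)
    regions = regions + params["w_region"] @ regions
    regions = regions + mlp(regions, params["w1"], params["w2"])

    # Top-down propagation: broadcast each region's features back to its nodes
    return x + assign @ regions

rng = np.random.default_rng(0)
N, R, D, H = 8, 2, 4, 16  # toy sizes, chosen arbitrarily
x = rng.normal(size=(N, D))
assign = np.eye(R)[rng.integers(0, R, size=N)]
params = {
    "w_region": rng.normal(size=(R, R)) * 0.1,
    "w1": rng.normal(size=(D, H)) * 0.1,
    "w2": rng.normal(size=(H, D)) * 0.1,
}
print(hierarchical_mix(x, assign, params).shape)  # (8, 4)
```

A real model would presumably stack several such levels and also mix along the time axis; the sketch keeps a single spatial level to make the data flow visible.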
Problem

Research questions and friction points this paper is trying to address.

Develops an efficient large-scale traffic forecasting model
Avoids the quadratic computational complexity of existing models
Captures dynamic spatiotemporal patterns across different regions
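To make the quadratic-complexity concern concrete, the following back-of-the-envelope sketch compares the number of pairwise interactions in full node-to-node mixing with a hypothetical two-level scheme that routes information through regions. The counting model (N for pooling, R² for region mixing, N for broadcast) and the choice R ≈ √N are assumptions for illustration, not figures from the paper.

```python
import math

def pairwise_interactions(num_nodes):
    """Full node-to-node mixing (e.g. dense attention): N^2 interactions."""
    return num_nodes * num_nodes

def hierarchical_interactions(num_nodes, num_regions):
    """Assumed two-level scheme: pool nodes into regions (N),
    mix regions with each other (R^2), broadcast back to nodes (N)."""
    return 2 * num_nodes + num_regions * num_regions

for n in (1_000, 10_000):
    r = math.isqrt(n)  # hypothetical choice: about sqrt(N) regions
    print(n, pairwise_interactions(n), hierarchical_interactions(n, r))
```

Under these assumptions the hierarchical count grows linearly in N, while the pairwise count grows quadratically.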
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical MLP-Mixer for efficient large-scale traffic forecasting
Multi-resolution feature extraction via bottom-up aggregation and top-down propagation
Adaptive region mixer dynamically captures evolving spatiotemporal patterns
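One way to read "generates transformation matrices based on regional semantics" is as a small hypernetwork: each region carries a semantic embedding, and a shared generator maps that embedding to the region's own weight matrix. The sketch below follows that reading. It is an assumption about the mechanism, not the paper's definition, and the names `adaptive_region_mix`, `region_emb`, and `generator` are hypothetical.

```python
import numpy as np

def adaptive_region_mix(region_feats, region_emb, generator):
    """Hypothetical adaptive region mixer.

    region_feats: (R, D) features of each region
    region_emb:   (R, E) semantic embedding of each region (assumed learnable)
    generator:    (E, D*D) shared map from embedding to a flattened matrix
    """
    num_regions, dim = region_feats.shape
    # Generate one (D, D) transformation matrix per region from its embedding
    weights = (region_emb @ generator).reshape(num_regions, dim, dim)
    # Apply each region's own matrix to that region's feature vector
    return np.einsum("rd,rde->re", region_feats, weights)

# Toy check: two regions with identical features but different embeddings
feats = np.array([[1.0, 2.0], [1.0, 2.0]])
emb = np.eye(2)                                  # one-hot embeddings
gen = np.stack([np.eye(2).ravel(),               # region 0 -> identity
                2.0 * np.eye(2).ravel()])        # region 1 -> scale by 2
print(adaptive_region_mix(feats, emb, gen))
```

Because the matrices are generated rather than stored per region, regions with similar embeddings would receive similar transformations, which is one plausible way to capture spatial heterogeneity with few parameters.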
Yongyao Wang
School of Computer Science and Engineering, Beihang University, Beijing, China
Jingyuan Wang
School of Computer Science and Engineering, Beihang University, Beijing, China
Xie Yu
School of Computer Science and Engineering, Beihang University, Beijing, China
Jiahao Ji
Beihang University | Nanyang Technological University
Spatio-temporal Data Mining, Physics-informed AI, Explainable AI
Chao Li
School of Computer Science and Engineering, Beihang University, Beijing, China