🤖 AI Summary
Existing spatio-temporal crowd flow prediction (STCFP) studies cover contextual features incompletely and use them inconsistently, which makes models hard to compare and leaves the best practice for context modeling unclear. To address this, the work introduces STContext, a multifaceted benchmark dataset for context-aware STCFP that provides nine spatio-temporal datasets across five STCFP scenarios and ten contextual features (e.g., weather, air quality index, holidays, points of interest, road networks). It also proposes a unified workflow for incorporating contextual features into deep STCFP methods, with four steps: feature transformation, dependency modeling, representation fusion, and training strategies. Through extensive experiments, the authors distill practical guidelines for effective context modeling and insights for future research.
📝 Abstract
In smart cities, context-aware spatio-temporal crowd flow prediction (STCFP) models leverage contextual features (e.g., weather) to identify unusual crowd mobility patterns and enhance prediction accuracy. However, the best practice for incorporating contextual features remains unclear because different papers use them inconsistently. Establishing a principled context modeling paradigm requires a multifaceted dataset that covers rich types of contextual features and multiple STCFP scenarios, yet existing open crowd flow datasets lack an adequate range of contextual features. To fill this gap, we create STContext, a multifaceted dataset for developing context-aware STCFP models. Specifically, STContext provides nine spatio-temporal datasets across five STCFP scenarios and includes ten contextual features, such as weather, air quality index, holidays, points of interest, and road networks. In addition, we propose a unified workflow for incorporating contextual features into deep STCFP methods, with steps including feature transformation, dependency modeling, representation fusion, and training strategies. Through extensive experiments, we obtain several useful guidelines for effective context modeling and insights for future research. STContext is open-sourced at https://github.com/Liyue-Chen/STContext.
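The first three workflow steps can be illustrated with a toy example. The sketch below is not taken from the STContext repository: the record fields (`temperature`, `holiday`, `weather`), the category list, and all function names are hypothetical. The choices of min-max scaling, one-hot encoding, a moving average over recent context, and fusion by concatenation are assumptions made for illustration only. The fourth step, the training strategy, is omitted because it depends on the downstream model.

```python
from typing import Dict, List, Sequence

# Hypothetical weather categories, used only for this illustration.
WEATHER_TYPES = ["clear", "rain", "snow"]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale numeric values to [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(label: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical label as a one-hot vector over `vocabulary`."""
    return [1.0 if label == item else 0.0 for item in vocabulary]


def transform_context(records: Sequence[Dict]) -> List[List[float]]:
    """Step 1 (feature transformation): turn raw context into numeric vectors."""
    temps = min_max_scale([r["temperature"] for r in records])
    rows = []
    for record, temp in zip(records, temps):
        holiday_flag = 1.0 if record["holiday"] else 0.0
        rows.append([temp, holiday_flag] + one_hot(record["weather"], WEATHER_TYPES))
    return rows


def lagged_context(rows: Sequence[Sequence[float]], t: int, window: int) -> List[float]:
    """Step 2 (dependency modeling): average the context vectors of the
    last `window` time steps ending at step `t`, a crude temporal summary."""
    start = max(0, t - window + 1)
    span = rows[start:t + 1]
    dim = len(rows[0])
    return [sum(row[d] for row in span) / len(span) for d in range(dim)]


def early_fusion(flow_history: Sequence[float], context_vector: Sequence[float]) -> List[float]:
    """Step 3 (representation fusion): concatenate flow history and context."""
    return list(flow_history) + list(context_vector)


records = [
    {"temperature": 10.0, "holiday": False, "weather": "clear"},
    {"temperature": 20.0, "holiday": False, "weather": "rain"},
    {"temperature": 30.0, "holiday": True, "weather": "rain"},
]
rows = transform_context(records)
context = lagged_context(rows, t=2, window=2)
fused = early_fusion([120.0, 135.0], context)
print(fused)
```

The fused vector would then be the input to a forecasting model. Concatenation is only one possible fusion choice; alternatives such as gating or additive fusion fit into the same slot of the workflow.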