XXLTraffic: Expanding and Extremely Long Traffic Dataset for Ultra-Dynamic Forecasting Challenges

📅 2024-06-18
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing public traffic forecasting datasets inadequately capture the ultra-dynamic characteristics of real-world scenarios (infrastructure evolution, temporal distribution shifts, and prolonged sensor failures), which hinders research on ultra-long-horizon forecasting and test-time adaptation. To address this, we introduce XXLTraffic, the largest publicly available traffic dataset to date, featuring the longest temporal span and a continuously expanding node count. XXLTraffic reflects these strong dynamics through two novel benchmark configurations: introduced temporal gaps and training-set downsampling. Built with multi-source sensor data cleaning, spatiotemporal alignment, and progressive expansion techniques, it supports both hourly and daily sequence modeling. Empirical evaluation demonstrates that XXLTraffic significantly enhances model assessment capabilities for ultra-long-horizon prediction, missing-value imputation, and robustness to distributional shifts. By enabling systematic study of ultra-dynamic behaviors, XXLTraffic advances traffic forecasting toward an ultra-dynamic paradigm.

📝 Abstract
Traffic forecasting is crucial for smart cities and intelligent transportation initiatives, where deep learning has made significant progress in modeling complex spatio-temporal patterns in recent years. However, current public datasets have limitations in reflecting the ultra-dynamic nature of real-world scenarios, characterized by continuously evolving infrastructures, varying temporal distributions, and temporal gaps due to sensor downtimes or changes in traffic patterns. These limitations inevitably restrict the practical applicability of existing traffic forecasting datasets. To bridge this gap, we present XXLTraffic, the largest available public traffic dataset with the longest timespan and increasing number of sensor nodes over the multiple years observed in the data, curated to support research in ultra-dynamic forecasting. Our benchmark includes both typical time-series forecasting settings with hourly and daily aggregated data and novel configurations that introduce gaps and down-sample the training size to better simulate practical constraints. We anticipate the new XXLTraffic will provide a fresh perspective for the time-series and traffic forecasting communities. It would also offer a robust platform for developing and evaluating models designed to tackle ultra-dynamic and extremely long forecasting problems. Our dataset supplements existing spatio-temporal data resources and leads to new research directions in this domain.
Problem

Research questions and friction points this paper is trying to address.

Current public traffic datasets fail to reflect real-world distribution shifts
XXLTraffic is introduced to support extremely long forecasting and test-time adaptation research
The benchmark adds practical constraints such as gaps and down-sampled training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Largest public traffic dataset with longest timespan
Includes gaps and down-sampled training data
Supports extremely long forecasting research
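
The hourly aggregation, gap, and down-sampling settings named above can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function names, the `steps_per_hour=12` default (which assumes 5-minute raw readings), and the reading of "gap" as the number of steps skipped between the end of the input window and the start of the forecast window are all assumptions made for illustration.

```python
import numpy as np


def hourly_aggregate(readings, steps_per_hour=12):
    """Average fixed-interval readings into hourly values.

    readings: array of shape (time, sensors).
    steps_per_hour: assumed number of raw readings per hour (hypothetical default).
    A trailing partial hour is dropped.
    """
    readings = np.asarray(readings, dtype=float)
    n_hours = readings.shape[0] // steps_per_hour
    trimmed = readings[: n_hours * steps_per_hour]
    return trimmed.reshape(n_hours, steps_per_hour, -1).mean(axis=1)


def gapped_windows(series, input_len, gap, horizon):
    """Build (input, target) pairs separated by `gap` skipped steps.

    The target window starts `gap` steps after the input window ends,
    which is one possible interpretation of a gap-based setting.
    """
    series = np.asarray(series)
    span = input_len + gap + horizon
    if series.shape[0] < span:
        raise ValueError("series is shorter than input_len + gap + horizon")
    inputs, targets = [], []
    for start in range(series.shape[0] - span + 1):
        inputs.append(series[start : start + input_len])
        target_start = start + input_len + gap
        targets.append(series[target_start : target_start + horizon])
    return np.stack(inputs), np.stack(targets)


def downsample_training(inputs, targets, keep_every):
    """Keep every `keep_every`-th training pair to shrink the training set."""
    return inputs[::keep_every], targets[::keep_every]


# Usage on synthetic data: 24 raw steps for 2 sensors, then a toy hourly series.
raw = np.arange(48).reshape(24, 2)
hourly = hourly_aggregate(raw)                       # shape (2, 2)

toy_series = np.arange(20).reshape(20, 1)
x, y = gapped_windows(toy_series, input_len=4, gap=3, horizon=2)
x_small, y_small = downsample_training(x, y, keep_every=4)
```

Keeping the gap as an explicit parameter makes it easy to compare a standard setting (`gap=0`) against increasingly distant forecast windows with the same windowing code.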
Authors

Du Yin, University of New South Wales
Hao Xue, University of New South Wales (human mobility, spatio-temporal data mining)
Arian Prabowo, University of New South Wales (spatiotemporal, forecasting, GNN, contrastive learning, geometric deep learning)
Shuang Ao, University of New South Wales
Flora D. Salim, University of New South Wales