🤖 AI Summary
This study addresses the limitations of existing trajectory prediction benchmarks, which are often sensor-centric, geographically constrained, or based on synthetic mobility traces, and therefore fail to capture real-world V2X communication dynamics. Using infrastructure deployed in the Modena Automotive Smart Area (MASA) in Italy, the work presents CAMASA, a large-scale dataset derived from more than 40 million Cooperative Awareness Messages (CAMs) and 2 million Decentralized Environmental Notification Messages (DENMs) collected over multiple months of authentic urban traffic. The preprocessing pipeline applies filtering, pseudonym reconciliation, and temporal normalization to 10 Hz; this lets the authors reconstruct vehicle trajectories across the stationID changes imposed by ETSI privacy mechanisms. With over 14,000 km of reconstructed vehicle paths and tens of thousands of unique station IDs, the dataset provides a statistically significant, real-world empirical foundation for research in trajectory prediction and Cooperative Intelligent Transportation Systems (C-ITS), and it also supports calibration of microscopic traffic simulators (e.g., SUMO) and ITS digital twin applications.
📝 Abstract
Trajectory prediction is a key enabler of autonomous and cooperative driving systems. However, most existing benchmarks are sensor-centric, geographically constrained, or based on synthetic mobility traces that do not capture real-world V2X communication dynamics. This paper introduces CAMASA, a large-scale infrastructure-based dataset derived from Cooperative Awareness Messages (CAMs) and Decentralized Environmental Notification Messages (DENMs) collected within the Modena Automotive Smart Area (MASA). The dataset comprises more than 40 million CAMs and 2 million DENMs recorded under authentic urban traffic conditions over multiple months. We present a rigorous preprocessing pipeline that includes filtering, pseudonym reconciliation to account for ETSI privacy-driven stationID changes, and temporal normalization to 10 Hz trajectories suitable for motion forecasting and time-series analysis. With over 14,000 km of reconstructed vehicle paths and tens of thousands of unique station IDs, CAMASA provides a statistically significant empirical foundation for research on Cooperative Intelligent Transportation Systems (C-ITS). Beyond trajectory prediction, the dataset enables calibration of microscopic urban traffic simulators (e.g., SUMO) and supports the development of realistic Intelligent Transportation Systems (ITS) Digital Twins by jointly modeling mobility patterns and V2X communication coverage in real deployments.
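To make the two central preprocessing steps more concrete, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes CAM positions have already been projected to a local planar frame in metres and that each track is non-empty with strictly increasing timestamps. The names `Sample`, `resample_10hz`, and `link_pseudonyms`, as well as the gating thresholds `max_gap_s` and `max_dist_m`, are hypothetical and do not come from the paper. Pseudonym reconciliation is approximated here by a simple greedy rule: a track that starts shortly after, and close to, the end of an earlier track is assumed to belong to the same vehicle.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Sample:
    t: float  # timestamp in seconds
    x: float  # planar position in metres (assumed local projection)
    y: float


def resample_10hz(track: List[Sample], rate_hz: float = 10.0) -> List[Sample]:
    """Linearly interpolate a time-sorted track onto a uniform time grid."""
    if len(track) < 2:
        return list(track)
    step = 1.0 / rate_hz
    # Number of grid points that fit inside the track's time span.
    n = int((track[-1].t - track[0].t) * rate_hz + 1e-9) + 1
    out: List[Sample] = []
    j = 0  # index of the segment [track[j], track[j + 1]] containing t
    for k in range(n):
        t = track[0].t + k * step
        while j + 1 < len(track) - 1 and track[j + 1].t < t:
            j += 1
        a, b = track[j], track[j + 1]
        w = min(1.0, (t - a.t) / (b.t - a.t))  # interpolation weight
        out.append(Sample(t, a.x + w * (b.x - a.x), a.y + w * (b.y - a.y)))
    return out


def link_pseudonyms(
    tracks: Dict[int, List[Sample]],
    max_gap_s: float = 2.0,    # hypothetical gating threshold
    max_dist_m: float = 30.0,  # hypothetical gating threshold
) -> Dict[int, int]:
    """Greedily map each stationID to a vehicle label by space-time gating."""
    labels: Dict[int, int] = {}
    open_ends: List[Tuple[int, Sample]] = []  # (vehicle label, last sample)
    next_label = 0
    # Process tracks in order of their first timestamp.
    for sid, tr in sorted(tracks.items(), key=lambda kv: kv[1][0].t):
        first = tr[0]
        best: Optional[Tuple[int, float]] = None  # (index in open_ends, distance)
        for i, (_, last) in enumerate(open_ends):
            gap = first.t - last.t
            dist = math.hypot(first.x - last.x, first.y - last.y)
            if 0.0 < gap <= max_gap_s and dist <= max_dist_m:
                if best is None or dist < best[1]:
                    best = (i, dist)
        if best is None:
            # No plausible predecessor: treat as a new vehicle.
            labels[sid] = next_label
            open_ends.append((next_label, tr[-1]))
            next_label += 1
        else:
            # Continue the closest matching vehicle under the new stationID.
            veh = open_ends[best[0]][0]
            labels[sid] = veh
            open_ends[best[0]] = (veh, tr[-1])
    return labels
```

In practice, a linkage step would likely also compare heading, speed, and vehicle dimensions carried in the CAM, and a resampling step would need to handle long reception gaps rather than interpolating across them. Both refinements are omitted here for brevity.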