🤖 AI Summary
This study addresses the reliance on proprietary data and the poor reproducibility of urban road-level traffic emission estimation by proposing a data-driven framework built entirely on open-source data. Methodologically, it integrates the MOVES emission model with open datasets (OpenStreetMap road networks, open GPS trajectories, regional traffic statistics, and satellite-derived remote sensing features) to train a neural network that predicts vehicle operating mode distributions, enabling estimation of CO, NOₓ, CO₂, and PM₂.₅ emissions at the road-link level. Its key contribution is demonstrating that accurate, reproducible, fine-grained emission modeling is feasible using exclusively open-source data. Applied to 45 municipalities in the Boston metropolitan area, the model reduces RMSE by over 50% for regional-scale emissions of these pollutants compared to the MOVES baseline, supporting the feasibility of low-cost, replicable emission estimation.
📝 Abstract
Open-source data offers a scalable and transparent foundation for estimating vehicle activity and emissions in urban regions. In this study, we propose a data-driven framework that integrates MOVES with open-source GPS trajectory data, OpenStreetMap (OSM) road networks, regional traffic datasets, and satellite imagery-derived feature vectors to estimate link-level operating mode distributions and traffic emissions. A neural network is trained to predict the distribution of MOVES-defined operating modes using only features derived from readily available data. The methodology was applied to open-source data for 45 municipalities in the Boston metropolitan area. The "ground truth" operating mode distribution was established using open-source OSM GPS trajectories. Compared to the MOVES baseline, the proposed model reduces RMSE by over 50% for regional-scale traffic emissions of key pollutants, including CO, NOx, CO2, and PM2.5. This study demonstrates the feasibility of low-cost, replicable, and data-driven emissions estimation using fully open data sources.
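To make the pipeline in the abstract concrete, the following is a minimal sketch of how a predicted operating mode distribution could be turned into a link-level emission estimate and scored with RMSE. It is an illustration under stated assumptions, not the authors' implementation: the function names, the three-mode example, and all numeric rates are hypothetical, and real MOVES operating modes and emission rates would come from MOVES itself.

```python
import math


def softmax(logits):
    """Convert raw model scores into fractions that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def link_emissions(mode_fractions, rates_g_per_hour, vehicle_hours):
    """Weight per-mode emission rates by time share, then scale by activity.

    mode_fractions: share of operating time spent in each mode (sums to 1)
    rates_g_per_hour: emission rate for each mode (hypothetical values here)
    vehicle_hours: total vehicle operating time on the link
    """
    if len(mode_fractions) != len(rates_g_per_hour):
        raise ValueError("one emission rate is needed per operating mode")
    weighted_rate = sum(f * r for f, r in zip(mode_fractions, rates_g_per_hour))
    return weighted_rate * vehicle_hours


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical example: three operating modes (e.g. idle, cruise, accelerate).
fractions = softmax([0.1, 1.2, 0.4])   # stand-in for a network's output layer
rates = [10.0, 40.0, 100.0]            # made-up grams per vehicle-hour
estimate = link_emissions(fractions, rates, vehicle_hours=2.0)
```

The softmax step stands in for the output layer of the neural network, which guarantees a valid distribution over operating modes. Comparing `rmse` of the model's emission estimates against that of a baseline, both measured against estimates derived from the GPS-based distribution, is one way a relative error reduction could be reported.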