Closed-Circuit Television Data as an Emergent Data Source for Urban Rail Platform Crowding Estimation

📅 2025-08-03
🤖 AI Summary
This study addresses the challenge of real-time, high-accuracy platform crowd density estimation in urban rail transit. Methodologically, it proposes a novel density estimation algorithm that combines semantic segmentation with linear optimization: it models passenger depth distribution from deep-learning-generated segmentation maps and uses linear optimization to extract counts. The study compares several state-of-the-art models, including YOLOv11, RT-DETRv2, APGCC, Crowd-ViT, and DeepLabV3, on a privacy-preserving dataset. Evaluated on over 600 hours of real-world CCTV footage from the Washington Metropolitan Area Transit Authority (WMATA), the method achieves centimeter-level spatial granularity and sub-second temporal responsiveness without auxiliary sensors. It reduces mean absolute error by 23.7% over existing benchmarks. The approach delivers a robust, production-ready visual perception foundation for intelligent dispatching, emergency response, and passenger service enhancement.

📝 Abstract
Accurately estimating urban rail platform occupancy can enhance transit agencies' ability to make informed operational decisions, thereby improving safety, operational efficiency, and customer experience, particularly in the context of crowding. However, sensing real-time crowding remains challenging and often depends on indirect proxies such as automatic fare collection data or staff observations. Recently, Closed-Circuit Television (CCTV) footage has emerged as a promising data source with the potential to yield accurate, real-time occupancy estimates. The presented study investigates this potential by comparing three state-of-the-art computer vision approaches for extracting crowd-related features from platform CCTV imagery: (a) object detection and counting using YOLOv11, RT-DETRv2, and APGCC; (b) crowd-level classification via a custom-trained Vision Transformer, Crowd-ViT; and (c) semantic segmentation using DeepLabV3. Additionally, we present a novel, highly efficient linear-optimization-based approach to extract counts from the generated segmentation maps while accounting for image object depth and, thus, for passenger dispersion along a platform. Tested on a privacy-preserving dataset created in collaboration with the Washington Metropolitan Area Transit Authority (WMATA) that encompasses more than 600 hours of video material, our results demonstrate that computer vision approaches can provide substantive value for crowd estimation. This work demonstrates that CCTV image data, independent of other data sources available to a transit agency, can enable more precise real-time crowding estimation and, eventually, timely operational responses for platform crowding mitigation.
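As an illustration of approach (a), the counting step that follows object detection can be sketched in a few lines. This is a minimal sketch, not the authors' pipeline: the `Detection` record, its field names, and the 0.5 confidence threshold are all assumptions made for the example, and a real detector such as YOLOv11 or RT-DETRv2 would supply the detections.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output; the field names are hypothetical."""
    label: str
    confidence: float


def count_people(detections, min_confidence=0.5):
    """Count detections labelled 'person' at or above a confidence threshold."""
    return sum(
        1
        for d in detections
        if d.label == "person" and d.confidence >= min_confidence
    )


# Toy detections for a single frame (made-up values).
frame_detections = [
    Detection("person", 0.91),
    Detection("person", 0.34),  # below the threshold, ignored
    Detection("bench", 0.88),   # not a person, ignored
    Detection("person", 0.77),
]
print(count_people(frame_detections))
```

The threshold trades missed passengers against false positives, so it would likely need tuning per camera view.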
Problem

Research questions and friction points this paper is trying to address.

Estimating urban rail platform crowding accurately
Using CCTV footage for real-time occupancy estimates
Comparing computer vision methods for crowd feature extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

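Uses YOLOv11, RT-DETRv2, and APGCC for object detection and counting
Employs Crowd-ViT for crowd-level classification
Applies DeepLabV3 for semantic segmentation, with a linear-optimization step that extracts depth-aware counts from the segmentation maps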
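
The abstract does not spell out the linear-optimization formulation, so the following is only a sketch of why depth matters when turning a segmentation map into a count. A passenger far from the camera covers fewer pixels than one close to it, so each image row can be given its own "people per pixel" weight and the estimate becomes a weighted sum of person pixels per row. The function names, the toy mask, and the weights are all hypothetical; in practice such weights might be fitted against annotated frames, for example with a linear program, which is an assumption rather than the authors' stated method.

```python
def row_pixel_counts(mask):
    """Number of person pixels in each image row of a binary mask."""
    return [sum(row) for row in mask]


def depth_weighted_count(mask, people_per_pixel):
    """Linear estimate: sum over rows of (person pixels) * (people per pixel)."""
    if len(mask) != len(people_per_pixel):
        raise ValueError("need one weight per image row")
    return sum(n * w for n, w in zip(row_pixel_counts(mask), people_per_pixel))


# Toy 4x6 mask: 1 = pixel classified as "person".
# Row 0 is assumed to be the far end of the platform.
mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
# Hypothetical weights: far rows count more people per pixel than near rows.
weights = [0.5, 0.5, 0.125, 0.125]
estimate = depth_weighted_count(mask, weights)
print(estimate)
```

Because the estimate is linear in the weights, fitting them to ground-truth counts stays a linear problem, which is one plausible reason a linear-optimization approach can be efficient.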
Authors

Riccardo Fiorista
Master of Science in Transportation Student, Massachusetts Institute of Technology
Public transportation, equity, accessibility

Awad Abdelhalim
Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Anson F. Stewart
Massachusetts Institute of Technology
Urban Planning, Public Transportation, Spatial Analytics, Accessibility

Gabriel L. Pincus
Washington Metropolitan Area Transit Authority, 300 7th Street SW, Washington, DC 20024, USA

Ian Thistle
Washington Metropolitan Area Transit Authority, 300 7th Street SW, Washington, DC 20024, USA

Jinhua Zhao
Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA 02139, USA