🤖 AI Summary
Autonomous taxis struggle to reliably detect and interpret complex crossing behaviors of vulnerable road users (VRUs) within urban operational design domains (ODDs).
Method: This paper proposes the Pattern-based Classification and Identification Framework (PCICF), a systematic scene recognition and classification framework. It introduces MoreSMIRK, a structured dictionary of multi-pedestrian crossing scenarios, and uses space-filling curves (SFCs) to reduce high-dimensional scene features to characteristic patterns that can be matched against dictionary entries, which lets it capture collective dynamics such as groups merging and splitting. The dictionary is constructed from the synthetic SMIRK dataset, and the framework is evaluated on the real-world PIE dataset.
Contribution/Results: PCICF identifies and classifies complex VRU crossing scenarios using computationally efficient components, which gives it potential for in-vehicle deployment. Its core contributions are: (1) a structured dictionary of multi-pedestrian crossing behavior tailored to ODD incident analysis, and (2) a lightweight, SFC-based scene-matching approach. Code is publicly available.
📝 Abstract
We have recently observed the commercial roll-out of robotaxis in various countries. They are deployed within an operational design domain (ODD) on specific routes and under specific environmental conditions, and are continuously monitored so that control can be regained in safety-critical situations. Since ODDs typically cover urban areas, robotaxis must reliably detect vulnerable road users (VRUs) such as pedestrians, bicyclists, or e-scooter riders. To better handle such varied traffic situations, end-to-end AI, which directly computes vehicle control actions from multi-modal sensor data instead of being used only for perception, is on the rise. High-quality data is needed for systematically training and evaluating such systems within their ODD. In this work, we propose PCICF, a framework to systematically identify and classify VRU situations to support incident analysis within an ODD. We base our work on the existing synthetic dataset SMIRK and extend its single-pedestrian-only design into the MoreSMIRK dataset, a systematically constructed, structured dictionary of multi-pedestrian crossing situations. We then use space-filling curves (SFCs) to transform multi-dimensional features of scenarios into characteristic patterns, which we match with corresponding entries in MoreSMIRK. We evaluate PCICF with the large real-world dataset PIE, which contains more than 150 manually annotated pedestrian crossing videos. We show that PCICF can successfully identify and classify complex pedestrian crossings, even when groups of pedestrians merge or split. Because it relies on computationally efficient components such as SFCs, PCICF even has the potential to be used onboard robotaxis, for example for OOD detection. We share an open-source replication package for PCICF containing its algorithms, the complete MoreSMIRK dataset and dictionary, and our experimental results at: https://github.com/Claud1234/PCICF
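The idea of turning multi-dimensional scenario features into one-dimensional patterns with an SFC can be illustrated with a minimal sketch. It is not taken from the PCICF replication package. The abstract does not say which curve, features, or matching rule PCICF uses, so the Z-order (Morton) curve, the normalized per-frame feature tuples, the dictionary keys, and the mean-absolute-difference matching below are all assumptions chosen for illustration.

```python
def morton_encode(coords, bits=8):
    """Interleave the bits of non-negative integer coordinates into one Z-order index."""
    code = 0
    n = len(coords)
    for bit in range(bits):
        for dim, value in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * n + dim.
            code |= ((value >> bit) & 1) << (bit * n + dim)
    return code


def quantize(value, lo=0.0, hi=1.0, bits=8):
    """Clip a float to [lo, hi] and map it to an integer in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def scene_pattern(frames, bits=8):
    """Map a sequence of per-frame feature tuples (assumed to lie in [0, 1])
    to a one-dimensional sequence of SFC indices."""
    return [morton_encode([quantize(v, bits=bits) for v in frame], bits)
            for frame in frames]


def closest_entry(pattern, dictionary):
    """Return the key of the dictionary pattern with the smallest mean absolute
    difference to `pattern` (an illustrative matching rule, not PCICF's)."""
    def distance(a, b):
        n = max(min(len(a), len(b)), 1)  # compare the overlapping prefix only
        return sum(abs(x - y) for x, y in zip(a, b)) / n
    return min(dictionary, key=lambda name: distance(pattern, dictionary[name]))


# Hypothetical dictionary: (x, y) position of a pedestrian over three frames.
dictionary = {
    "left_to_right": scene_pattern([(0.1, 0.5), (0.5, 0.5), (0.9, 0.5)]),
    "right_to_left": scene_pattern([(0.9, 0.5), (0.5, 0.5), (0.1, 0.5)]),
}
observed = scene_pattern([(0.12, 0.5), (0.5, 0.5), (0.88, 0.5)])
match = closest_entry(observed, dictionary)
print(match)
```

The sketch uses only integer bit operations and a linear scan, which is the kind of low-cost computation that makes SFC-based matching plausible for onboard use. Note that a Z-order curve preserves locality only approximately, so a real dictionary lookup would likely need a more robust pattern comparison than the simple distance used here.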