Vision-based Perception System for Automated Delivery Robot-Pedestrians Interactions

📅 2025-08-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of safe, efficient, and socially acceptable navigation for autonomous delivery robots in dense urban pedestrian environments, this paper proposes a monocular vision–driven framework that unifies multi-pedestrian perception and behavioral understanding. Methodologically, it integrates human pose estimation, monocular depth estimation, and multi-object tracking to construct an identity-robust trajectory prediction model, augmented by a vulnerable pedestrian identification mechanism to enhance social awareness. The key contribution lies in leveraging joint pose–depth cues to improve identity maintenance accuracy under occlusion and ensure long-term trajectory consistency. Evaluated on the MOT17 benchmark, the approach achieves a 10% gain in IDF1 and a 7% improvement in MOTA, while maintaining detection accuracy above 85%. These results demonstrate significantly enhanced navigation reliability and socially inclusive interaction capability in high-density, heavily occluded scenarios.
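The summary describes fusing 2D human-pose keypoints with monocular depth estimates so that pedestrians can be reasoned about in 3D, which helps keep identities stable under occlusion. As a minimal illustration (not the paper's implementation), the standard pinhole back-projection lifts a keypoint (u, v) with estimated depth d into camera-frame coordinates; the intrinsics below are hypothetical placeholder values:

```python
from dataclasses import dataclass


@dataclass
class PinholeCamera:
    """Pinhole intrinsics: focal lengths (fx, fy) and principal point (cx, cy), in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(cam: PinholeCamera, u: float, v: float, depth: float) -> tuple:
    """Lift a 2D image point (u, v) with monocular depth (metres) to a 3D camera-frame point.

    Standard pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    """
    x = (u - cam.cx) * depth / cam.fx
    y = (v - cam.cy) * depth / cam.fy
    return (x, y, depth)


# Example with made-up intrinsics: a keypoint at the principal point maps onto the optical axis.
cam = PinholeCamera(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
print(backproject(cam, 640.0, 360.0, 5.0))  # → (0.0, 0.0, 5.0)
```

Back-projecting, say, hip or torso keypoints this way gives a per-pedestrian 3D anchor whose continuity across frames can be checked even when the 2D bounding box is partially occluded.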

📝 Abstract
The integration of Automated Delivery Robots (ADRs) into pedestrian-heavy urban spaces introduces unique challenges in terms of safe, efficient, and socially acceptable navigation. We develop a complete pipeline for single-vision-sensor multi-pedestrian detection and tracking, pose estimation, and monocular depth perception. Leveraging real-world MOT17 dataset sequences, this study demonstrates how integrating human-pose estimation and depth cues enhances pedestrian trajectory prediction and identity maintenance, even under occlusions and in dense crowds. Results show measurable improvements, including up to a 10% increase in identity preservation (IDF1), a 7% improvement in multi-object tracking accuracy (MOTA), and consistently high detection precision exceeding 85%, even in challenging scenarios. Notably, the system identifies vulnerable pedestrian groups, supporting more socially aware and inclusive robot behaviour.
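The abstract reports gains in IDF1 and MOTA, the two standard multi-object tracking metrics from the MOT benchmarks. For readers unfamiliar with them, the following sketch shows their textbook definitions (the counts are illustrative, not the paper's results): MOTA penalises false negatives, false positives, and identity switches relative to ground-truth objects, while IDF1 is the F1 score of identity-consistent matches.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int, num_gt: int) -> float:
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(id_tp: int, id_fp: int, id_fn: int) -> float:
    """ID F1 score: F1 over identity-consistent true/false positives and negatives."""
    return 2 * id_tp / (2 * id_tp + id_fp + id_fn)


# Illustrative counts only: 100 ground-truth objects, 10 misses, 5 false alarms, 5 ID switches.
print(mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100))  # → 0.8
```

A 10% IDF1 gain with a smaller MOTA gain, as reported here, is the typical signature of a tracker that mostly reduces identity switches and fragmentations rather than raw detection errors.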
Problem

Research questions and friction points this paper is trying to address.

Safe navigation of delivery robots in pedestrian-heavy urban spaces
Multi-pedestrian detection and tracking using a single vision sensor
Enhancing pedestrian trajectory prediction under occlusions and dense crowds
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-based multi-pedestrian detection and tracking
Human-pose estimation with monocular depth perception
Enhanced trajectory prediction under occlusions
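The trajectory-prediction component listed above is not specified in detail on this page; as a hedged baseline sketch of what any such module must do, a constant-velocity extrapolation from the last two tracked positions is the usual starting point (the function below is illustrative, not the paper's model):

```python
def predict_constant_velocity(track, horizon, dt=1.0):
    """Extrapolate a pedestrian track under a constant-velocity assumption.

    track:   list of (x, y) positions sampled at fixed interval dt (e.g. metres, ground plane)
    horizon: number of future steps to predict
    Returns the list of predicted (x, y) positions for steps 1..horizon.
    """
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return [(x1 + vx * k * dt, y1 + vy * k * dt) for k in range(1, horizon + 1)]


# A pedestrian moving +1 m/step in x and +0.5 m/step in y continues on that line.
print(predict_constant_velocity([(0.0, 0.0), (1.0, 0.5)], horizon=2))  # → [(2.0, 1.0), (3.0, 1.5)]
```

Learned predictors are typically evaluated against exactly this baseline, since in dense crowds pedestrians deviate from straight-line motion to avoid collisions.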
Ergi Tushe
Laboratory of Innovations in Transportation (LiTrans) and Data Science Program, Toronto Metropolitan University, Toronto, Canada
Bilal Farooq
Laboratory of Innovations in Transportation (LiTrans), Toronto Metropolitan University
Simulation · Behavioural Modelling · Machine Learning · Intelligent Systems · Smart Cities