🤖 AI Summary
This work addresses the challenge of modeling complex relative motion between the ego-vehicle and pedestrians in egocentric pedestrian trajectory prediction. To this end, it introduces the Mamba architecture to this task for the first time and proposes an ego-motion-guided trajectory prediction network. The method employs dual Mamba encoders to separately capture temporal features of both pedestrians and the ego-vehicle, and designs an ego-motion-guided Mamba decoder that explicitly models their dynamic spatial–temporal relationships to generate future trajectories. Experimental results on the PIE and JAAD datasets demonstrate that the proposed approach achieves state-of-the-art performance, confirming its effectiveness and superiority in dynamic driving scenarios.
📝 Abstract
Future trajectory prediction of a tracked pedestrian from an egocentric perspective is a key task in areas such as autonomous driving and robot navigation. The challenge of this task lies in the complex dynamic relative motion between the ego-camera and the tracked pedestrian. To address this challenge, we propose an ego-motion-guided trajectory prediction network based on the Mamba model. First, two Mamba models serve as encoders that extract pedestrian motion and ego-motion features from the pedestrian's movement and the ego-vehicle's movement, respectively. Then, an ego-motion-guided Mamba decoder explicitly models the relative motion between the pedestrian and the vehicle, integrating pedestrian motion features as historical context with ego-motion features as guiding cues to produce decoded features. Finally, the future trajectory is generated from the decoded features corresponding to the future timestamps. Extensive experiments demonstrate the effectiveness of the proposed model, which achieves state-of-the-art performance on the PIE and JAAD datasets.
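The encoder-decoder flow described above can be sketched at a high level as follows. This is a minimal illustrative sketch, not the authors' implementation: the `SeqEncoder` stand-ins are simple recurrences (not real Mamba blocks), and the input dimensions, hidden size, fusion scheme, and observation/prediction horizons are all assumed placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

class SeqEncoder:
    """Stand-in for a Mamba encoder: a plain linear recurrence over time
    (illustrative only; a real Mamba block uses selective state spaces)."""
    def __init__(self, in_dim, hid_dim):
        self.W_in = rng.normal(0, 0.1, (in_dim, hid_dim))
        self.W_h = rng.normal(0, 0.1, (hid_dim, hid_dim))

    def __call__(self, x):  # x: (T, in_dim) -> (T, hid_dim)
        h = np.zeros(self.W_h.shape[0])
        states = []
        for t in range(x.shape[0]):
            h = np.tanh(x[t] @ self.W_in + h @ self.W_h)
            states.append(h)
        return np.stack(states)

class GuidedDecoder:
    """Stand-in for the ego-motion-guided decoder: pedestrian features act
    as historical context, ego-motion features as guiding cues, fused at
    each future step (fusion by concatenation is an assumption here)."""
    def __init__(self, hid_dim, out_dim):
        self.W_fuse = rng.normal(0, 0.1, (2 * hid_dim, hid_dim))
        self.W_out = rng.normal(0, 0.1, (hid_dim, out_dim))

    def __call__(self, ped_feat, ego_feat, horizon):
        ctx = ped_feat[-1]          # last pedestrian state as initial context
        cue = ego_feat[-1]          # last observed ego-motion state as cue
        preds = []
        for _ in range(horizon):
            ctx = np.tanh(np.concatenate([ctx, cue]) @ self.W_fuse)
            preds.append(ctx @ self.W_out)
        return np.stack(preds)      # (horizon, out_dim)

# Hypothetical setup: 15 observed frames, 45 predicted frames,
# bounding boxes (x1, y1, x2, y2) and 2-D ego-motion (e.g. speed, yaw rate).
T_obs, T_pred = 15, 45
ped_traj = rng.normal(size=(T_obs, 4))
ego_motion = rng.normal(size=(T_obs, 2))

ped_enc, ego_enc = SeqEncoder(4, 32), SeqEncoder(2, 32)
decoder = GuidedDecoder(32, 4)
future = decoder(ped_enc(ped_traj), ego_enc(ego_motion), T_pred)
print(future.shape)
```

The point of the sketch is the data flow: two separate temporal encoders, one per motion stream, feeding a decoder that conditions the pedestrian context on ego-motion cues before emitting one box per future timestamp.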