Unified Spatial-Temporal Edge-Enhanced Graph Networks for Pedestrian Trajectory Prediction

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pedestrian trajectory prediction suffers from fragmented modeling of spatial interactions and temporal dependencies, and existing methods neglect higher-order cross-time interactions. To address this, we propose UniEdge, which builds a unified spatial-temporal graph representation that encodes spatial and temporal relationships jointly, reducing higher-order cross-time interactions to first-order edge relations. UniEdge introduces an Edge-to-Edge-Node-to-Node graph convolution (E2E-N2N-GCN) that jointly models explicit social interactions among pedestrians and implicit influence propagation at the edge level. It also uses a Transformer encoder-based predictor to strengthen long-range temporal dependency modeling. Evaluated on the ETH, UCY, and SDD benchmarks, the method achieves lower average and final displacement errors (ADE/FDE) than state-of-the-art approaches and produces more consistent long-term trajectories.

📝 Abstract
Pedestrian trajectory prediction aims to forecast future movements based on historical paths. Spatial-temporal (ST) methods often model spatial interactions among pedestrians and temporal dependencies of individuals separately. They overlook the direct impact of interactions among different pedestrians across different time steps (i.e., high-order cross-time interactions). This limits their ability to capture ST inter-dependencies and hinders prediction performance. To address these limitations, we propose UniEdge with three major designs. First, we introduce a unified ST graph data structure that simplifies high-order cross-time interactions into first-order relationships, enabling the learning of ST inter-dependencies in a single step. This avoids the information loss caused by multi-step aggregation. Second, traditional GNNs focus on aggregating pedestrian node features, neglecting the propagation of implicit interaction patterns encoded in edge features. We propose the Edge-to-Edge-Node-to-Node Graph Convolution (E2E-N2N-GCN), a novel dual-graph network that jointly models explicit N2N social interactions among pedestrians and implicit E2E influence propagation across these interaction patterns. Finally, to overcome the limited receptive fields of auto-regressive architectures and their difficulty in capturing long-range dependencies, we introduce a transformer encoder-based predictor that enables global modeling of temporal correlation. UniEdge outperforms state-of-the-art methods on multiple datasets, including ETH, UCY, and SDD.
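The unified ST graph idea can be illustrated with a minimal sketch. It assumes each node is a (pedestrian, time step) pair and that every pair of distinct nodes is joined by one edge, so an interaction between two pedestrians at different time steps is a direct (first-order) neighbor relation. The function name `build_unified_st_graph`, the (dx, dy, dt) edge feature, and the toy coordinates are illustrative assumptions, not the paper's implementation.

```python
from itertools import combinations


def build_unified_st_graph(tracks):
    """Build a fully connected graph over (pedestrian, time) nodes.

    tracks: dict mapping pedestrian id -> list of (x, y), one per time step.
    Returns (nodes, edges), where edges maps a node pair to a hypothetical
    feature (dx, dy, dt): relative displacement and time offset.
    """
    nodes = [(pid, t)
             for pid, path in sorted(tracks.items())
             for t in range(len(path))]
    edges = {}
    for (p, s), (q, t) in combinations(nodes, 2):
        (x1, y1), (x2, y2) = tracks[p][s], tracks[q][t]
        edges[((p, s), (q, t))] = (x2 - x1, y2 - y1, t - s)
    return nodes, edges


# Toy scene (assumed data): two pedestrians observed for two time steps.
tracks = {"A": [(0.0, 0.0), (1.0, 0.0)], "B": [(0.0, 2.0), (1.0, 2.0)]}
nodes, edges = build_unified_st_graph(tracks)

# A at t=0 and B at t=1 share a direct edge, so a single aggregation
# step can already see this cross-pedestrian, cross-time interaction.
cross = edges[(("A", 0), ("B", 1))]
```

With separate spatial and temporal graphs, the same pair would only be reachable through two hops (one spatial, one temporal), which is the multi-step aggregation the abstract says the unified graph avoids.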
Problem

Research questions and friction points this paper is trying to address.

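ST methods model spatial interactions and temporal dependencies separately, overlooking high-order cross-time interactions
Traditional GNNs aggregate node features and neglect implicit interaction patterns encoded in edge features
Auto-regressive predictors have limited receptive fields and struggle to capture long-range temporal dependencies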
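The difference between node-level and edge-level propagation in the list above can be sketched as follows. This is a toy illustration only: plain weighted averaging stands in for learned graph convolutions, edges are treated as neighbors when they share an endpoint (a line-graph assumption), and the names `n2n_step` and `e2e_step`, the scalar features, and the order of the two steps are hypothetical rather than taken from the paper.

```python
def n2n_step(node_feat, edge_feat):
    """Node-to-node: each node takes the edge-weighted mean of its neighbors."""
    out = {}
    for v in node_feat:
        total, weight = 0.0, 0.0
        for (a, b), w in edge_feat.items():
            if v in (a, b):
                u = b if v == a else a  # the endpoint that is not v
                total += w * node_feat[u]
                weight += w
        # Isolated nodes (no incident edges) keep their own feature.
        out[v] = total / weight if weight else node_feat[v]
    return out


def e2e_step(edge_feat):
    """Edge-to-edge: each edge averages itself with edges sharing an endpoint."""
    out = {}
    for e in edge_feat:
        group = [w for f, w in edge_feat.items() if set(e) & set(f)]
        out[e] = sum(group) / len(group)
    return out


# Toy path graph a - b - c with scalar features (assumed data).
node_feat = {"a": 1.0, "b": 2.0, "c": 4.0}
edge_feat = {("a", "b"): 1.0, ("b", "c"): 3.0}

new_edges = e2e_step(edge_feat)             # propagate along edges first
new_nodes = n2n_step(node_feat, new_edges)  # then aggregate node features
```

A node-only GNN would run just `n2n_step` with fixed edge weights; the edge step lets interaction patterns influence one another before they are used to mix node features.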
Innovation

Methods, ideas, or system contributions that make the work stand out.

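Unified ST graph structure that reduces high-order cross-time interactions to first-order relationships
Edge-to-Edge-Node-to-Node GCN (E2E-N2N-GCN) that models explicit N2N and implicit E2E interactions jointly
Transformer encoder-based predictor for global modeling of temporal correlation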
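Why a transformer encoder gives a global temporal view can be seen in a minimal self-attention sketch. It assumes a single head with no learned projections (queries, keys, and values are the inputs themselves), which is a simplification for illustration and not the paper's predictor; the function name `self_attention` and the toy sequence are hypothetical.

```python
import math


def self_attention(seq):
    """Scaled dot-product self-attention over a sequence of time steps.

    seq: list of feature vectors, one per time step.
    Returns (outputs, weights); weights[i][j] is how much step i attends
    to step j.
    """
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        row = [e / z for e in exps]
        weights.append(row)
        outputs.append([sum(w * v[i] for w, v in zip(row, seq))
                        for i in range(d)])
    return outputs, weights


# Toy sequence of four time steps with 2-D features (assumed data).
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
outputs, weights = self_attention(seq)
```

Every row of `weights` is strictly positive across all time steps, so the first step draws on the last one in a single layer, whereas an auto-regressive model passes information forward one step at a time.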