ASTRA: A Scene-aware TRAnsformer-based model for trajectory prediction

📅 2025-01-16
🤖 AI Summary
To improve pedestrian trajectory prediction for autonomous driving, this paper proposes ASTRA, a lightweight scene-aware Transformer-based model. It contributes (i) an agent-scene-aware embedding that jointly models scene context, spatial dynamics, social interactions, and temporal evolution; (ii) a weighted penalty loss function intended to prioritize short-term prediction accuracy and mitigate error accumulation; and (iii) generalization across viewpoints, covering bird’s-eye view (BEV) and ego-vehicle view (EVV). The architecture combines a U-Net-based feature extractor, a graph-aware Transformer encoder, and a conditional variational autoencoder (CVAE) that supplies stochastic predictions alongside the deterministic ones. On ETH-UCY, where accuracy is measured by average displacement error (ADE) and final displacement error (FDE), the model improves on prior work by an average of 27% in the deterministic setting and 10% in the stochastic setting; on PIE, it improves by 26%. It does so with about one seventh of the parameters of the existing state-of-the-art model.
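For readers unfamiliar with the two metrics named above, the following minimal sketch computes ADE and FDE for a single predicted trajectory using their standard definitions (mean and final-step Euclidean distance to the ground truth). The function name `displacement_errors` and the toy coordinates are illustrative assumptions, not taken from the paper.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points.

    Illustrative helper, not the paper's evaluation code.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each timestep.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over the whole horizon
    fde = dists[-1]                # error at the final timestep only
    return ade, fde

# Toy example: the prediction drifts away from the truth by 0, 1, then 2 units.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)
```

Benchmarks usually average these values over all pedestrians and scenes; that aggregation is omitted here for brevity.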

📝 Abstract
We present ASTRA (A Scene-aware TRAnsformer-based model for trajectory prediction), a lightweight pedestrian trajectory forecasting model that integrates scene context, spatial dynamics, social inter-agent interactions, and temporal progressions for precise forecasting. We use the latent vector representation of a U-Net-based feature extractor to capture scene representations, and a graph-aware transformer encoder to capture social interactions. These components are integrated to learn an agent-scene-aware embedding, enabling the model to learn spatial dynamics and forecast the future trajectories of pedestrians. The model is designed to produce both deterministic and stochastic outcomes, with the stochastic predictions generated by incorporating a Conditional Variational Auto-Encoder (CVAE). ASTRA also proposes a simple yet effective weighted penalty loss function, which helps to yield predictions that outperform a wide array of state-of-the-art deterministic and generative models. ASTRA demonstrates an average improvement of 27% in the deterministic setting and 10% in the stochastic setting on the ETH-UCY dataset, and a 26% improvement on the PIE dataset, with seven times fewer parameters than the existing state-of-the-art model (see Figure 1). Additionally, the model's versatility allows it to generalize across different perspectives, such as Bird's Eye View (BEV) and Ego-Vehicle View (EVV).
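The abstract does not specify the form of the weighted penalty loss, so the following is only a sketch of the general idea: a per-timestep squared error combined with a weight for each step of the prediction horizon. The function names `weighted_penalty_loss` and `linear_weights`, and the linearly decaying schedule in particular, are hypothetical choices made for illustration and should not be read as the paper's formulation.

```python
def linear_weights(horizon, start=1.0, end=0.5):
    """Hypothetical schedule: weights move linearly from `start` to `end`."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if horizon == 1:
        return [start]
    step = (end - start) / (horizon - 1)
    return [start + i * step for i in range(horizon)]

def weighted_penalty_loss(pred, truth, weights):
    """Weighted mean of per-timestep squared Euclidean errors.

    Assumed form for illustration; the paper's exact loss may differ.
    """
    if not (len(pred) == len(truth) == len(weights)) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    sq_errors = [(px - tx) ** 2 + (py - ty) ** 2
                 for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(w * e for w, e in zip(weights, sq_errors)) / total_weight

# Toy example: squared errors per step are 0, 1, and 4.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
weights = linear_weights(len(pred))           # [1.0, 0.75, 0.5]
loss = weighted_penalty_loss(pred, truth, weights)
uniform = weighted_penalty_loss(pred, truth, [1.0] * len(pred))
print(loss, uniform)
```

With the decaying schedule, the large late-horizon error contributes less than it would under uniform weights, so the loss is lower than the plain mean squared error. Reversing `start` and `end` would emphasize long-term accuracy instead.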
Problem

Research questions and friction points this paper is trying to address.

Pedestrian trajectory prediction
Autonomous driving
Behavior pattern recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

U-Net and Graph-aware Transformer
CVAE for Predictive Uncertainty
Weighted Loss Function Optimization
Authors

Izzeddin Teeti, Oxford Brookes University (Autonomous Vehicles, Computer Vision, Machine Learning, Theory of Mind)
Aniket Thomas, Indian Institute of Technology Bombay, India
M. Monga, Indian Institute of Technology Bombay, India
Sachin Kumar, Indian Institute of Technology Bombay, India
Uddeshya Singh, Indian Institute of Technology Bombay, India
Andrew Bradley, Autonomous Driving & Intelligent Transport group
Biplab Banerjee, Indian Institute of Technology Bombay, India
Fabio Cuzzolin, Professor of Artificial Intelligence, Oxford Brookes University (Artificial Intelligence, Imprecise Probabilities, Belief Functions, Computer Vision, Machine Learning)