Contrast&Compress: Learning Lightweight Embeddings for Short Trajectories

2025-06-03
AI Summary
Problem: Retrieving short trajectories that are similar in both semantics and direction remains challenging because conventional similarity metrics (e.g., FFT-based ones) are insensitive to directional characteristics. Method: This paper proposes a lightweight contrastive embedding framework in which a Transformer encoder is trained with a triplet loss, and introduces, for the first time, a cosine-based contrastive objective that models directional intent. The learned embeddings are discriminative and interpretable, and they support real-time inference at low dimensionality (e.g., 16D, and qualitatively down to 4D). Results: Evaluated on Argoverse 2, embeddings trained with the cosine objective outperform FFT-based baselines on retrieval minADE and minFDE. Compact models with low-dimensional embeddings keep strong retrieval performance while reducing computational overhead, which suits real-time motion prediction and deployment in autonomous navigation systems.
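To make the cosine-based triplet objective concrete, here is a minimal sketch in plain Python. It assumes the distance between two embeddings is one minus their cosine similarity and uses a margin of 0.2; the function names, the margin value, and the zero-vector convention are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_u * norm_v)


def cosine_triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (1 - cosine similarity).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (a hypothetical value chosen here).
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# 4D toy embeddings: the positive matches the anchor's direction,
# the negative is orthogonal to it.
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [1.0, 0.0, 0.0, 0.0]
negative = [0.0, 1.0, 0.0, 0.0]
loss = cosine_triplet_loss(anchor, positive, negative)
```

Because cosine distance ignores vector magnitude, this kind of objective only rewards agreement in direction within the embedding space, which is one plausible reason it pairs well with directional intent.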

Abstract
The ability to retrieve semantically and directionally similar short-range trajectories with both accuracy and efficiency is foundational for downstream applications such as motion forecasting and autonomous navigation. However, prevailing approaches often depend on computationally intensive heuristics or latent anchor representations that lack interpretability and controllability. In this work, we propose a novel framework for learning fixed-dimensional embeddings for short trajectories by leveraging a Transformer encoder trained with a contrastive triplet loss that emphasizes the importance of discriminative feature spaces for trajectory data. We analyze the influence of Cosine and FFT-based similarity metrics within the contrastive learning paradigm, with a focus on capturing the nuanced directional intent that characterizes short-term maneuvers. Our empirical evaluation on the Argoverse 2 dataset demonstrates that embeddings shaped by Cosine similarity objectives yield superior clustering of trajectories by both semantic and directional attributes, outperforming FFT-based baselines in retrieval tasks. Notably, we show that compact Transformer architectures, even with low-dimensional embeddings (e.g., 16 dimensions, but qualitatively down to 4), achieve a compelling balance between retrieval performance (minADE, minFDE) and computational overhead, aligning with the growing demand for scalable and interpretable motion priors in real-time systems. The resulting embeddings provide a compact, semantically meaningful, and efficient representation of trajectory data, offering a robust alternative to heuristic similarity measures and paving the way for more transparent and controllable motion forecasting pipelines.
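The retrieval metrics named in the abstract can be sketched as follows. This assumes the standard definitions: ADE is the mean point-wise Euclidean distance between a candidate and the ground-truth trajectory, FDE is the distance at the final point, and the "min" variants take the best value over the retrieved candidates. The function names and toy coordinates are illustrative, and the paper's exact evaluation protocol may differ.

```python
import math


def displacement_errors(candidate, ground_truth):
    """Return (ADE, FDE) for one candidate against the ground truth.

    Both trajectories are sequences of (x, y) points of equal length.
    """
    if len(candidate) != len(ground_truth) or not candidate:
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.dist(p, q) for p, q in zip(candidate, ground_truth)]
    return sum(dists) / len(dists), dists[-1]


def min_ade_fde(candidates, ground_truth):
    """Best ADE and best FDE over a set of retrieved candidates.

    The two minima are taken independently, so they may come from
    different candidates.
    """
    errors = [displacement_errors(c, ground_truth) for c in candidates]
    return min(e[0] for e in errors), min(e[1] for e in errors)


ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
retrieved = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)],  # good early, misses the endpoint
    [(0.0, 2.0), (1.0, 2.0), (2.0, 0.0)],  # offset early, exact endpoint
]
min_ade, min_fde = min_ade_fde(retrieved, ground_truth)
```

In this toy case the lowest ADE comes from the first candidate and the lowest FDE from the second, which is why the two numbers are usually reported side by side.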
Problem

Research questions and friction points this paper is trying to address.

Learning lightweight embeddings for short trajectories efficiently
Improving interpretability and controllability in trajectory similarity metrics
Balancing retrieval performance and computational overhead in real-time systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer encoder with contrastive triplet loss
Cosine similarity for discriminative embeddings
Compact low-dimensional trajectory representations
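As an illustration of why compact embeddings keep retrieval cheap, the sketch below stores unit-normalized 4D vectors so that cosine similarity reduces to a dot product at query time. The index layout, the function names, and the maneuver labels are hypothetical; a real system would obtain the vectors from the trained encoder rather than hand-written values.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(embeddings):
    """Pre-normalize every stored embedding once, at indexing time."""
    return {key: normalize(vec) for key, vec in embeddings.items()}


def retrieve(index, query, k=2):
    """Return the keys of the k stored embeddings most similar to the query."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, vec)), key) for key, vec in index.items()
    )
    return [key for _, key in heapq.nlargest(k, scored)]


# Hand-written 4D vectors standing in for encoder outputs (assumption).
index = build_index({
    "straight": [1.0, 0.0, 0.0, 0.0],
    "slight_left": [1.0, 1.0, 0.0, 0.0],
    "left_turn": [0.0, 1.0, 0.0, 0.0],
})
neighbours = retrieve(index, [0.9, 0.1, 0.0, 0.0], k=2)
```

With 4D vectors each comparison costs four multiplications, so even a brute-force scan over many stored trajectories stays inexpensive.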
A. Vivekanandan
FZI Research Center for Information Technology, Karlsruhe, Germany; Karlsruhe Institute of Technology (KIT), Germany
Christian Hubschneider
Research Scientist, FZI Forschungszentrum Informatik
Autonomous Driving, Machine Learning
J. Zollner
FZI Research Center for Information Technology, Karlsruhe, Germany; Karlsruhe Institute of Technology (KIT), Germany