Temporally Consistent Unsupervised Segmentation for Mobile Robot Perception

📅 2025-07-29
🤖 AI Summary
This paper addresses temporally consistent terrain segmentation for mobile robots operating in unstructured environments where labeled data are scarce, and proposes an unsupervised, video-level terrain segmentation method. The core innovation is the first incorporation of temporal consistency constraints into unsupervised terrain segmentation: superpixel-level features are extracted with a foundation model (specifically DINOv2), and cross-frame feature propagation combined with consistency regularization guides the clustering so that traversable regions and terrain boundaries are identified stably across frames. The method requires no human annotations. Evaluated on off-road benchmarks, including RUGD and RELLIS-3D, it reports improvements in segmentation accuracy (+8.2% mIoU) and temporal stability (+23.6% inter-frame IoU), supporting more reliable perception for autonomous navigation in open, unstructured environments.

📝 Abstract
Rapid progress in terrain-aware autonomous ground navigation has been driven by advances in supervised semantic segmentation. However, these methods rely on costly data collection and labor-intensive ground truth labeling to train deep models. Furthermore, autonomous systems are increasingly deployed in unrehearsed, unstructured environments where no labeled data exists and semantic categories may be ambiguous or domain-specific. Recent zero-shot approaches to unsupervised segmentation have shown promise in such settings but typically operate on individual frames and therefore lack temporal consistency, a critical property for robust perception in unstructured environments. To address this gap, we introduce Frontier-Seg, a method for temporally consistent unsupervised segmentation of terrain from mobile robot video streams. Frontier-Seg clusters superpixel-level features extracted from foundation model backbones (specifically DINOv2) and enforces temporal consistency across frames to identify persistent terrain boundaries, or frontiers, without human supervision. We evaluate Frontier-Seg on a diverse set of benchmark datasets, including RUGD and RELLIS-3D, demonstrating its ability to perform unsupervised segmentation across unstructured off-road environments.
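The abstract describes clustering superpixel-level features but gives no implementation details, so the following is only a minimal sketch of that general idea, not Frontier-Seg itself. It assumes that per-patch feature vectors (for example from a DINOv2 backbone) and a superpixel id for each patch are already available. The function names `pool_by_superpixel` and `kmeans` are hypothetical, and plain k-means stands in for whatever clustering the authors actually use.

```python
import random

def pool_by_superpixel(features, superpixels):
    """Average feature vectors within each superpixel.

    features:    list of feature vectors, one per pixel or patch (assumed given)
    superpixels: list of superpixel ids, same length as features (assumed given)
    Returns a dict {superpixel_id: mean feature vector}.
    """
    sums, counts = {}, {}
    for vec, sp in zip(features, superpixels):
        if sp not in sums:
            sums[sp] = [0.0] * len(vec)
            counts[sp] = 0
        sums[sp] = [s + v for s, v in zip(sums[sp], vec)]
        counts[sp] += 1
    return {sp: [s / counts[sp] for s in sums[sp]] for sp in sums}

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an illustrative stand-in, not the paper's clustering).

    Returns (labels, centroids), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each non-empty cluster's centroid to its mean.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In use, one would pool the features of a frame with `pool_by_superpixel`, pass the pooled vectors to `kmeans`, and paint each superpixel with its cluster label to obtain a per-frame segmentation. Clustering at the superpixel level keeps the number of points small and tends to align cluster boundaries with image edges.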
Problem

Research questions and friction points this paper is trying to address.

Unsupervised terrain segmentation lacks temporal consistency
Existing methods need costly labeled data for training
Ambiguous semantic categories in unstructured environments pose challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised segmentation using foundation model features
Temporal consistency enforcement across video frames
Superpixel-level clustering for terrain boundary identification
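This summary does not say how temporal consistency is enforced, so the sketch below illustrates just one simple possibility and should not be read as the authors' mechanism. It assumes each frame has already been clustered independently, yielding one centroid per cluster, and it carries cluster ids from the previous frame to the current one by greedy nearest-centroid matching. The names `match_clusters` and `relabel` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_clusters(prev_centroids, curr_centroids):
    """Greedy one-to-one matching of current clusters to previous clusters.

    Returns a dict {current_index: persistent_id}. Matched clusters inherit
    the index of their previous-frame partner; current clusters left without
    a partner receive fresh ids starting at len(prev_centroids).
    """
    # All candidate pairs, cheapest (closest centroids) first.
    pairs = sorted(
        (sq_dist(p, c), pi, ci)
        for pi, p in enumerate(prev_centroids)
        for ci, c in enumerate(curr_centroids)
    )
    mapping, used_prev = {}, set()
    for _, pi, ci in pairs:
        if ci not in mapping and pi not in used_prev:
            mapping[ci] = pi
            used_prev.add(pi)
    # Any cluster with no previous partner is treated as newly appeared.
    next_id = len(prev_centroids)
    for ci in range(len(curr_centroids)):
        if ci not in mapping:
            mapping[ci] = next_id
            next_id += 1
    return mapping

def relabel(labels, mapping):
    """Rewrite per-superpixel labels of the current frame with persistent ids."""
    return [mapping[l] for l in labels]
```

Applied frame by frame, this keeps a terrain cluster's id stable even when the per-frame clustering returns its clusters in a different order. A greedy match is used here only for brevity; an optimal assignment, or smoothing of centroids over time, would be natural alternatives.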
Christian Ellis
Postdoctoral Researcher, University of Texas at Austin
Robotics, Safety Assurance
Maggie Wigness
US Army Research Laboratory
Computer Vision, Machine Learning, Robotics
Craig Lennon
DEVCOM Army Research Laboratory, Adelphi, MD, United States
Lance Fiondella
Department of Electrical and Computer Engineering, University of Massachusetts Dartmouth