TARIC: Memory-Augmented Traversability-Aware Outdoor VLN under Interrupted Semantic Cues

📅 2026-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses loss of direction in long-range outdoor vision-language navigation (VLN) when semantic goal cues become sparse, occluded, or leave the field of view. The authors propose a world-aligned 3D cue memory combined with real-time near-field traversability analysis. Traversability is treated as a stability condition for sustaining goal-directed navigation rather than only a local safety check, and an uncertainty-aware memory readout keeps producing feasible, goal-consistent headings while cues are absent. Key components are visibility-gated semantic bearing extraction, near-field traversability modeling, and world-aligned 3D memory storage. Evaluated on quadrupedal and wheeled platforms over routes of 600–1000 m, the method improves simulation success rate by over 10 percentage points over the strongest baseline and reaches a 40% real-world success rate, compared with 17.5% for the strongest baseline.
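To make the idea of a world-aligned cue memory concrete, here is a minimal sketch, not the authors' implementation. It assumes a planar (2D) simplification of the 3D memory, and the class name `CueMemory`, the `growth` rate, and the confidence formula are all hypothetical choices made for illustration. The point it shows is that a cue stored in world coordinates can still be read out as a robot-frame bearing after the robot has detoured, with confidence dropping as uncertainty accumulates.

```python
import math
from dataclasses import dataclass


@dataclass
class CueMemory:
    """Hypothetical world-frame cue store (planar simplification)."""
    x: float              # cue position in the world frame (m)
    y: float
    sigma: float          # assumed positional uncertainty of the cue (m)
    growth: float = 0.05  # assumed uncertainty growth per metre travelled

    def age(self, distance_travelled: float) -> None:
        # Assumption: drift makes an old cue less trustworthy as the robot moves.
        self.sigma += self.growth * distance_travelled

    def readout(self, rx: float, ry: float, ryaw: float):
        """Return (bearing in the robot frame in rad, confidence in (0, 1])."""
        dx, dy = self.x - rx, self.y - ry
        rel = math.atan2(dy, dx) - ryaw
        rel = math.atan2(math.sin(rel), math.cos(rel))  # wrap to [-pi, pi]
        dist = max(math.hypot(dx, dy), 1e-6)
        # Angular error scales roughly with sigma / dist, so nearby,
        # uncertain cues get lower confidence than distant ones.
        confidence = 1.0 / (1.0 + self.sigma / dist)
        return rel, confidence


cue = CueMemory(x=10.0, y=0.0, sigma=1.0)
print(cue.readout(0.0, 0.0, 0.0))     # cue straight ahead
print(cue.readout(10.0, -10.0, 0.0))  # after a detour, cue is to the left
```

Because the cue lives in the world frame, the readout depends only on the current pose, which is why a detour does not make the stored direction stale in the way a robot-centric bearing would.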
📝 Abstract
Outdoor vision-language navigation (VLN) in long-range, open-world environments is frequently disrupted by semantic-cue interruptions, where informative goal cues become sparse, occluded, or leave the field of view. Once such cues disappear, agents enter a cue-free phase and often degrade into backtracking, oscillatory headings, or aimless exploration. While memory-based methods attempt to bridge these gaps, they often fail under traversability-driven detours: the remembered cue direction may be infeasible, forcing detours that prolong cue-free phases and gradually render robot-centric cues stale and implicit histories blurred. This makes traversability a stability condition for maintaining goal-directed guidance, rather than merely a local safety concern. We propose a unified outdoor VLN framework that survives semantic-cue interruptions by maintaining traversability-consistent executable guidance throughout prolonged cue-free phases. Specifically, our method extracts semantic bearings from visibility-gated goal or exploration cues and grounds them into executable headings using a real-time near-field traversability profile, providing goal-consistent feasible guidance beyond reject-only safety filtering. To prevent guidance degradation during detours, we lift intermittent 2D evidence into a world-aligned 3D cue memory with an uncertainty-aware readout mechanism, ensuring guidance remains continuously reachable and stable as the robot moves. We evaluate the framework on quadrupedal and wheeled platforms over 600–1000 m routes. Our method improves simulation success rate by over 10 percentage points over the strongest baseline and achieves a real-world success rate of 40%, compared to 17.5% for the strongest baseline, with substantially higher robustness during prolonged cue-free intervals.
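The step of grounding a semantic bearing into an executable heading can be illustrated with a small sketch. This is an assumption-laden example rather than the paper's method: the function name `ground_heading`, the representation of the traversability profile as `(heading, free_distance)` samples, and the `min_clearance` threshold are all hypothetical. It contrasts with reject-only filtering by returning the nearest feasible heading instead of simply vetoing a blocked one.

```python
import math


def ground_heading(bearing, profile, min_clearance=1.5):
    """Pick the traversable heading closest to the desired semantic bearing.

    bearing: desired direction in the robot frame (rad).
    profile: list of (heading_rad, free_distance_m) samples around the robot
             (a hypothetical near-field traversability profile).
    Returns the chosen heading, or None if no sampled heading is traversable.
    """
    def ang_diff(a, b):
        # Absolute angular difference, wrapped to [0, pi].
        d = a - b
        return abs(math.atan2(math.sin(d), math.cos(d)))

    feasible = [h for h, free in profile if free >= min_clearance]
    if not feasible:
        return None
    return min(feasible, key=lambda h: ang_diff(h, bearing))


# The cue lies almost straight ahead, but the forward direction is blocked.
profile = [(math.radians(-30), 4.0), (0.0, 0.5), (math.radians(30), 3.0)]
heading = ground_heading(0.1, profile)
print(math.degrees(heading))  # the +30 degree sample is the nearest feasible one
```

A reject-only filter would stop at discarding the blocked forward heading; selecting the closest feasible alternative keeps the robot moving in a direction that is still consistent with the goal cue.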
Problem

Research questions and friction points this paper is trying to address.

vision-language navigation
semantic-cue interruption
traversability
outdoor navigation
memory degradation
Innovation

Methods, ideas, or system contributions that make the work stand out.

traversability-aware navigation
memory-augmented VLN
semantic-cue interruption
3D cue memory
executable guidance
Tianle Zeng
Shenzhen Key Laboratory of Robotics and Computer Vision, Southern University of Science and Technology, Shenzhen, China.
Hanjing Ye
PhD Student at Southern University of Science and Technology
Robot Person Following, Place Recognition
Jianwei Peng
Shenzhen Key Laboratory of Robotics and Computer Vision, Southern University of Science and Technology, Shenzhen, China.
Jingwen Yu
Shenzhen Key Laboratory of Robotics and Computer Vision, Southern University of Science and Technology, Shenzhen, China; CKS Robotics Institute, Hong Kong University of Science and Technology, Hong Kong SAR, China.
Hanxuan Chen
College of Electrical and Information Engineering, Hunan University, Hunan, China.
Hong Zhang
School of Cybersecurity and Computer Science, Hebei University
Big Data, Edge Computing, Information Security, AI