🤖 AI Summary
Terrain traversability prediction in unknown environments traditionally relies on costly field exploration, labeled datasets, and high-risk trial-and-error interactions. Method: This paper proposes a zero-shot vision-language collaborative navigation framework that integrates large language models (LLMs) into traversability reasoning for the first time. Leveraging multimodal in-context learning, it directly interprets environmental images to generate real-time traversability maps—without requiring training data or active robotic probing. Contribution/Results: By synergistically combining visual feature extraction with LLM-based commonsense reasoning, the method enables safe, robust, goal-directed navigation across diverse indoor and outdoor scenes. Experiments demonstrate a significant reduction in robot collision risk, consistent successful target reaching, and superior performance over state-of-the-art approaches—thereby breaking the paradigm dependency on supervised learning and physical interaction.
📝 Abstract
The advancement of robotics and autonomous navigation systems hinges on the ability to accurately predict terrain traversability. Traditional methods for generating datasets to train these prediction models often involve putting robots into potentially hazardous environments, posing risks to equipment and safety. To solve this problem, we present ZeST, a novel approach leveraging visual reasoning capabilities of Large Language Models (LLMs) to create a traversability map in real-time without exposing robots to danger. Our approach not only performs zero-shot traversability and mitigates the risks associated with real-world data collection but also accelerates the development of advanced navigation systems, offering a cost-effective and scalable solution. To support our findings, we present navigation results, in both controlled indoor and unstructured outdoor environments. As shown in the experiments, our method provides safer navigation when compared to other state-of-the-art methods, constantly reaching the final goal.