SAFER-Nav: Enhancing Safety for Visual Robot Navigation via Segmentation-Aware Fine-Tuning

📅 2026-06-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing RGB-based visual navigation models generalize poorly to unknown environments and often produce unsafe trajectories because they do not explicitly model obstacles and traversable regions. This work proposes a segmentation-aware fine-tuning mechanism that integrates semantic segmentation structure into end-to-end fine-tuning of the navigation policy, operates solely on RGB inputs, and is designed to be compatible with diverse vision backbones. Without relying on external trajectory correction, the method reduces collision rates across multiple robot platforms and indoor environments, in both static and dynamic obstacle scenarios, while maintaining goal-reaching performance. It yields fewer collisions than ViNT, NoMaD, and their CARE-augmented variants.

📝 Abstract

Vision-based navigation models, particularly foundation models, generate viable trajectories from RGB observations alone. However, even state-of-the-art transformer- and diffusion-based policies struggle to generalize in unfamiliar deployment environments containing unseen obstacles or shifted conditions. The resulting trajectories often remain goal-directed but unsafe. Existing efforts improve safety through external trajectory correction or internal geometric priors, yet the resulting policies are not trained to explicitly represent obstacle boundaries or traversable free-space structure. To address this, we propose a navigation model that incorporates these structures directly into the policy via fine-tuning and is designed to be compatible with diverse RGB-based backbones. Across multiple robot platforms, indoor environments, and static and dynamic obstacle scenarios, our method reduces collision frequency relative to ViNT, NoMaD, and their CARE-augmented variants while maintaining goal-reaching performance.
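The abstract does not spell out the training objective, so the following is only a minimal sketch of one way segmentation structure could enter policy fine-tuning: an auxiliary pixel-wise segmentation loss added to the usual waypoint-imitation loss. The function names, the L2 action loss, the cross-entropy segmentation term, and the `seg_weight` parameter are all hypothetical and are not taken from the paper.

```python
import math

def action_l2_loss(pred_waypoints, expert_waypoints):
    """Mean squared distance between predicted and expert (x, y) waypoints."""
    total = 0.0
    for (px, py), (ex, ey) in zip(pred_waypoints, expert_waypoints):
        total += (px - ex) ** 2 + (py - ey) ** 2
    return total / len(pred_waypoints)

def segmentation_ce_loss(class_probs, labels):
    """Mean pixel-wise cross-entropy.

    class_probs[i] is the predicted class distribution for pixel i
    (e.g. [p_traversable, p_obstacle]); labels[i] is the true class index.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, label in zip(class_probs, labels):
        total += -math.log(max(probs[label], eps))
    return total / len(labels)

def segmentation_aware_loss(pred_waypoints, expert_waypoints,
                            class_probs, labels, seg_weight=0.5):
    """Hypothetical combined objective: imitation loss plus a weighted
    auxiliary segmentation loss (assumed form, not the paper's)."""
    return (action_l2_loss(pred_waypoints, expert_waypoints)
            + seg_weight * segmentation_ce_loss(class_probs, labels))

# Toy example: two waypoints, two pixels, two classes.
pred = [(1.0, 0.0), (2.0, 0.0)]
expert = [(1.0, 0.0), (2.0, 1.0)]
probs = [[0.5, 0.5], [1.0, 0.0]]
labels = [0, 0]
loss = segmentation_aware_loss(pred, expert, probs, labels)
```

In a sketch like this, the segmentation term only shapes the learned representation during fine-tuning; at deployment the policy would still consume RGB alone, which matches the abstract's description of an RGB-only method.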

Problem

Research questions and friction points this paper is trying to address.

visual robot navigation

safety

obstacle generalization

traversable free-space

foundation models

Innovation

Methods, ideas, or system contributions that make the work stand out.

segmentation-aware fine-tuning

visual navigation

obstacle boundary representation