SR-Nav: Spatial Relationships Matter for Zero-shot Object Goal Navigation

πŸ“… 2026-03-18
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the challenge of zero-shot object-goal navigation, where existing methods struggle with reliable perception and planning under limited viewpoints or weak semantic cues. To overcome this, the authors propose SR-Nav, a novel framework that introduces a goal-centric Dynamic Spatial Relationship Graph (DSRG)β€”the first of its kindβ€”to encode observations and empirical spatial relationships using foundation models. SR-Nav incorporates a relation-matching module to enhance perceptual robustness and a dynamic relation-planning module to optimize path search. By effectively integrating visual perception with structured scene priors, the method significantly improves reasoning capability, success rate, and navigation efficiency under partial observability on the HM3D benchmark, demonstrating the critical role of explicit spatial relationship modeling in zero-shot navigation.

πŸ“ Abstract
Zero-shot object-goal navigation aims to find target objects in unseen environments using only egocentric observations. Recent methods leverage foundation models' comprehension and reasoning capabilities to enhance navigation performance. However, when faced with poor viewpoints or weak semantic cues, foundation models often fail to support reliable reasoning in both perception and planning, resulting in inefficient or failed navigation. We observe that inherent relationships among objects and regions encode structured scene priors, which help agents infer plausible target locations even under partial observations. Motivated by this insight, we propose Spatial Relation-aware Navigation (SR-Nav), a framework that models both observed and experience-based spatial relationships to enhance perception and planning alike. Specifically, SR-Nav first constructs a Dynamic Spatial Relationship Graph (DSRG) that encodes target-centered spatial relationships through foundation models and updates dynamically with real-time observations. We then introduce a Relation-aware Matching Module, which replaces naive detection with relationship matching, leveraging the diverse relationships in the DSRG to verify and correct perceptual errors and thereby improve visual robustness. Finally, we design a Dynamic Relationship Planning Module that reduces the planning search space by dynamically computing optimal paths over the DSRG from the current position, guiding planning and reducing exploration redundancy. Experiments on HM3D show that our method achieves state-of-the-art performance in both success rate and navigation efficiency. The code will be publicly available at https://github.com/Mzyw-1314/SR-Nav
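The abstract's core machinery can be illustrated with a minimal sketch: a goal-centric graph whose edges carry spatial relations with confidence scores, updated as new observations arrive, and planned over by preferring well-supported relations. This is an illustrative sketch only; the class name, the symmetric-edge assumption, and the (1 − confidence) edge cost are assumptions, not the authors' DSRG implementation.

```python
import heapq
from collections import defaultdict


class DynamicSpatialRelationGraph:
    """Toy goal-centric spatial relation graph (illustrative, not SR-Nav's code)."""

    def __init__(self, goal):
        self.goal = goal
        # adjacency: node -> {neighbor: (relation_label, confidence)}
        self.edges = defaultdict(dict)

    def add_relation(self, a, b, relation, confidence):
        # Store the relation symmetrically (a simplifying assumption).
        self.edges[a][b] = (relation, confidence)
        self.edges[b][a] = (relation, confidence)

    def update_from_observation(self, a, b, relation, confidence):
        # Dynamic update: keep the higher-confidence estimate for a pair.
        old = self.edges[a].get(b)
        if old is None or confidence > old[1]:
            self.add_relation(a, b, relation, confidence)

    def plan_to_goal(self, start):
        # Dijkstra with (1 - confidence) edge costs, so paths through
        # well-supported relations are preferred. Returns None if unreachable.
        dist = {start: 0.0}
        prev = {}
        pq = [(0.0, start)]
        while pq:
            d, node = heapq.heappop(pq)
            if node == self.goal:
                break
            if d > dist.get(node, float("inf")):
                continue
            for nbr, (_, conf) in self.edges[node].items():
                nd = d + (1.0 - conf)
                if nd < dist.get(nbr, float("inf")):
                    dist[nbr] = nd
                    prev[nbr] = node
                    heapq.heappush(pq, (nd, nbr))
        if self.goal not in prev and start != self.goal:
            return None
        path, node = [self.goal], self.goal
        while node != start:
            node = prev[node]
            path.append(node)
        return path[::-1]


g = DynamicSpatialRelationGraph("bed")
g.add_relation("living_room", "hallway", "adjacent", 0.9)
g.add_relation("hallway", "bedroom", "adjacent", 0.8)
g.add_relation("bedroom", "bed", "contains", 0.95)
print(g.plan_to_goal("living_room"))  # ['living_room', 'hallway', 'bedroom', 'bed']
```

A later high-confidence observation (e.g. the bed becoming directly visible) re-routes the plan without rebuilding the graph, which mirrors the paper's point that dynamic relation updates shrink the search space as observations accumulate.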
Problem

Research questions and friction points this paper is trying to address.

zero-shot object-goal navigation
spatial relationships
foundation models
visual perception
navigation planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

spatial relationships
zero-shot navigation
foundation models
dynamic graph
relation-aware planning
πŸ”Ž Similar Papers
No similar papers found.
Leyuan Fang
Professor in College of Electrical and Information Engineering, Hunan University
Image Processing, Weakly Supervised Learning, Optical Coherence Tomography, Remote Sensing Image Classification
Zan Mao
School of Artificial Intelligence and Robotics, Hunan University, Changsha, China, 410114
Zijing Wang
School of Artificial Intelligence and Robotics, Hunan University, Changsha, China, 410114
Yinlong Yan
School of Artificial Intelligence and Robotics, Hunan University, Changsha, China, 410114