AI Summary
This work addresses perceptual uncertainty in open-vocabulary navigation, which arises from semantic ambiguity and model errors, by proposing an approach based on a 3D probabilistic scene graph and a multiverse decision mechanism. The method maintains full categorical distributions over semantic labels in the scene graph, samples multiple plausible world states from their joint distribution, and evaluates the compatibility between candidate navigation landmarks and each sampled state, aiming at globally better decisions than deterministic pipelines. It also introduces an evidence-theory-driven experience calibrator that uses memories of past successes and failures to correct perception outputs online, supporting lifelong adaptation. The approach reports state-of-the-art success rates of 66.1%, 44.8%, and 67.9% on the MP3D, HM3D, and HSSD benchmarks, respectively.
Abstract
Open-vocabulary navigation requires embodied agents to manage significant perception uncertainty stemming from semantic ambiguity and model errors.
However, most existing works settle for locally optimal, deterministic approaches, which rule out navigation decisions that reason over multiple composite possibilities, even though such reasoning is critical for globally better solutions.
In this paper, we propose Probabilistic Scene Graph Navigation (PSG-Nav), which constructs a 3D Probabilistic Scene Graph that uses full semantic categorical distributions to account for perception uncertainty.
To compose these local distributions efficiently and reason about the best navigation landmarks, we propose Multiverse Decision, which samples the most likely world settings (multiverses) from the joint distribution and evaluates each navigation landmark by its compatibility with the sampled multiverses.
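The sampling-and-scoring idea can be illustrated with a minimal sketch. It is not the authors' implementation: the node names, label distributions, and compatibility values are hypothetical, and the joint distribution is assumed to factorize over nodes (independent labels), which is a simplification of whatever joint model PSG-Nav actually uses.

```python
import random
from collections import defaultdict

# Hypothetical scene graph: each node keeps a full categorical distribution
# over semantic labels instead of a single hard label.
scene_graph = {
    "obj_0": {"sofa": 0.6, "bed": 0.3, "bench": 0.1},
    "obj_1": {"tv": 0.7, "monitor": 0.3},
    "obj_2": {"sink": 0.5, "toilet": 0.5},
}

def sample_universes(graph, n_samples, rng):
    """Draw world states, assuming node labels are independent (a simplification)."""
    universes = []
    for _ in range(n_samples):
        world = {}
        for node, dist in graph.items():
            labels = list(dist)
            weights = [dist[label] for label in labels]
            world[node] = rng.choices(labels, weights=weights, k=1)[0]
        universes.append(world)
    return universes

def score_landmarks(universes, compatibility):
    """Average each node's goal compatibility over all sampled universes."""
    totals = defaultdict(float)
    for world in universes:
        for node, label in world.items():
            totals[node] += compatibility.get(label, 0.0)
    return {node: total / len(universes) for node, total in totals.items()}

# Hypothetical compatibility of each label with the goal "tv".
compatibility = {"tv": 1.0, "monitor": 0.4, "sofa": 0.6, "bed": 0.1}

rng = random.Random(0)
universes = sample_universes(scene_graph, n_samples=200, rng=rng)
scores = score_landmarks(universes, compatibility)
best_landmark = max(scores, key=scores.get)
```

Averaging over sampled worlds lets a node with an ambiguous label still contribute in proportion to how often it is compatible with the goal, rather than being discarded by a hard argmax over labels.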
To mitigate false positives due to epistemic uncertainty in open-vocabulary navigation, we introduce the Evidential Experience Calibrator, which enables online lifelong adaptation by cross-validating detections against memories of past successes and failures.
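One way to picture calibration from success and failure memories is with subjective-logic style evidence counts. The sketch below is an assumption made for illustration, not the paper's formulation: the class names, the prior weight of 2, and the optimistic base rate of 1.0 (so categories with no history pass through unchanged) are all hypothetical choices.

```python
from dataclasses import dataclass

PRIOR_WEIGHT = 2.0  # non-informative prior weight, as commonly used in subjective logic

@dataclass
class Evidence:
    successes: float = 0.0
    failures: float = 0.0

    def opinion(self):
        """Return (belief, disbelief, uncertainty), which sum to 1."""
        total = self.successes + self.failures + PRIOR_WEIGHT
        return (self.successes / total,
                self.failures / total,
                PRIOR_WEIGHT / total)

    def trust(self, base_rate=1.0):
        """Projected probability: belief plus the base-rate share of uncertainty."""
        belief, _, uncertainty = self.opinion()
        return belief + base_rate * uncertainty

class ExperienceCalibrator:
    """Hypothetical calibrator that down-weights categories with a failure history."""

    def __init__(self):
        self.memory = {}

    def record(self, category, success):
        # Store one navigation outcome for this category.
        evidence = self.memory.setdefault(category, Evidence())
        if success:
            evidence.successes += 1
        else:
            evidence.failures += 1

    def calibrate(self, category, confidence):
        # Categories never seen before have trust 1.0 and are left unchanged.
        evidence = self.memory.get(category, Evidence())
        return confidence * evidence.trust()

calibrator = ExperienceCalibrator()
unseen = calibrator.calibrate("tv", 0.8)
for _ in range(3):
    calibrator.record("tv", success=False)
after_failures = calibrator.calibrate("tv", 0.8)
for _ in range(3):
    calibrator.record("tv", success=True)
after_recovery = calibrator.calibrate("tv", 0.8)
```

With this choice of base rate the calibrator can only lower a detection's confidence, which matches the stated goal of suppressing false positives; later successes partially restore trust in a category.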
Extensive experiments on the widely used MP3D, HM3D, and HSSD benchmarks demonstrate that PSG-Nav establishes new state-of-the-art results, achieving Success Rates of 66.1%, 44.8%, and 67.9%, respectively. Code is available at: https://psg-nav.github.io/