🤖 AI Summary
Visual place recognition (VPR) degrades significantly under cross-temporal appearance variations caused by seasonal changes, weather conditions, illumination shifts, and dynamic objects. These variations make matching-score distributions highly variable, so manually set fixed thresholds generalize poorly. To address this, we propose an adaptive threshold selection method based on a Gaussian Mixture Model (GMM) fitted to negative-sample matching scores, i.e., scores from non-matching image pairs. Unlike prior approaches, our method explicitly models uncertainty using only negative samples, which allows robust decision thresholds to be derived automatically, independent of scene and descriptor, without manual parameter tuning. The method is lightweight and directly applicable as a post-processing step in VPR pipelines. Extensive experiments across standard VPR benchmarks (e.g., Nordland, SPED) and diverse feature descriptors (e.g., NetVLAD, CosPlace) demonstrate consistent and significant improvements in recall and area under the ROC curve (AUC), substantially enhancing the robustness of robotic deployment in open-world environments.
📝 Abstract
Visual place recognition (VPR) is an important component technology for camera-based mapping and navigation applications. It is a challenging problem because images of the same place may appear quite different for reasons including seasonal changes, weather, illumination, structural changes to the environment, and transient pedestrian or vehicle traffic. Papers focusing on generating image descriptors for VPR report their results using metrics such as recall@K and ROC curves. For a robot implementation, however, deciding which matches are sufficiently good is often reduced to a manually set threshold, and it is difficult to manually select a threshold that works across a variety of visual scenarios. This paper addresses the problem of automatically selecting a threshold for VPR by looking at the 'negative' Gaussian mixture statistics for a place, that is, image statistics indicating 'not this place'. We show that this approach can be used to select thresholds that work well for a variety of image databases and image descriptors.
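The abstract does not spell out the exact procedure, so the following is only a minimal sketch of the general idea of deriving a threshold from 'negative' statistics. It assumes that matching scores are descriptor distances (lower means more similar), that the distances from a place to known non-matching images are modeled with a two-component 1-D Gaussian mixture, and that the threshold is set where the fitted mixture predicts a chosen false-accept rate `alpha`. The function names, the number of components, `alpha`, and the synthetic data are all illustrative assumptions, not the authors' implementation.

```python
import math
import random


def fit_gmm_1d(xs, k=2, iters=100, min_var=1e-8):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative helper)."""
    xs = sorted(xs)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, min_var)
    # Initialise means at evenly spaced quantiles with a shared variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
            weights[j] = nj / n
    return weights, means, variances


def mixture_cdf(t, weights, means, variances):
    """Probability, under the fitted mixture, that a negative distance is below t."""
    return sum(w * 0.5 * (1.0 + math.erf((t - m) / math.sqrt(2.0 * v)))
               for w, m, v in zip(weights, means, variances))


def threshold_for_false_accept_rate(weights, means, variances, alpha=0.01):
    """Bisect for the distance t at which the mixture CDF equals alpha."""
    spread = 10.0 * math.sqrt(max(variances))
    lo, hi = min(means) - spread, max(means) + spread
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, means, variances) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic stand-in for distances from one place to non-matching images.
rng = random.Random(0)
negatives = ([rng.gauss(1.0, 0.05) for _ in range(1000)]
             + [rng.gauss(1.3, 0.10) for _ in range(1000)])

weights, means, variances = fit_gmm_1d(negatives)
threshold = threshold_for_false_accept_rate(weights, means, variances, alpha=0.01)
false_accepts = sum(d < threshold for d in negatives) / len(negatives)
print(f"threshold={threshold:.3f}, empirical false-accept rate={false_accepts:.3f}")
```

Under these assumptions, a query would be accepted as a match only when its distance falls below `threshold`. If the descriptor produces similarity scores instead of distances, the comparison and the tail of the mixture used for the threshold would need to be flipped.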