DisPlace: Discriminative Place Projections for Multi-Reference Visual Place Recognition

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing visual place recognition (VPR) methods that fuse multiple reference traversals struggle to separate nuisance variation caused by environmental or viewpoint changes from features that genuinely identify a place, which limits their robustness. This work proposes DisPlace, a framework that explicitly models descriptor variability across reference traversals and formulates multi-reference fusion as a generalized eigenvalue problem. By learning a discriminative subspace projection, DisPlace maximizes between-place separability while suppressing within-place variation, yielding a single compact representation per place. Applied to six state-of-the-art VPR descriptors on four benchmark datasets (Oxford RobotCar, Nordland, Pittsburgh30k, and Google Landmarks v2), it outperforms seven multi-reference baselines in 49 of 54 appearance-varying conditions, consistently improves descriptor-level fusion under viewpoint change and in unstructured settings, and requires less storage at inference than all compared fusion methods.
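
For orientation, a generalized eigenvalue problem of this kind is commonly written in the Fisher/LDA form shown here. This is a sketch of the standard textbook formulation, not necessarily the exact objective used in the paper. All symbols are assumptions introduced for illustration: x_{p,r} is the descriptor of place p in reference traversal r, \mu_p is the mean descriptor of place p, \mu is the mean over places, and k is the target dimensionality.

```latex
\begin{aligned}
S_w &= \frac{1}{PR}\sum_{p=1}^{P}\sum_{r=1}^{R}
       (x_{p,r}-\mu_p)(x_{p,r}-\mu_p)^{\top}
       && \text{(within-place scatter)} \\
S_b &= \frac{1}{P}\sum_{p=1}^{P}
       (\mu_p-\mu)(\mu_p-\mu)^{\top}
       && \text{(between-place scatter)} \\
S_b\, w_i &= \lambda_i\, S_w\, w_i,
       \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_k \\
W &= [\,w_1,\dots,w_k\,], \qquad z_p = W^{\top}(\mu_p-\mu)
\end{aligned}
```

Directions with large \lambda_i vary strongly between places but little across traversals of the same place, which is the property the summary attributes to the learned projection.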

📝 Abstract

A key challenge in Visual Place Recognition (VPR) is matching query images against reference maps captured under diverse environmental conditions and viewpoints. While multiple reference traversals improve robustness, existing fusion strategies either aggregate references uniformly or rely on heuristic selection, without distinguishing descriptor variations that preserve stable place identity from those caused by changing conditions or viewpoints. In this paper, we propose DisPlace, a multi-reference VPR framework that fuses multiple reference descriptors into a single compact and discriminative place representation. DisPlace formulates descriptor fusion as a generalized eigenvalue problem that maximizes between-place separability while suppressing within-place variation across references, rather than preserving overall descriptor variance. Unlike existing multi-reference fusion methods, DisPlace exploits variation across reference traversals to identify which linear combinations of descriptor dimensions preserve place identity and which capture condition- or viewpoint-specific variation. We evaluate DisPlace on Oxford RobotCar, Nordland, Pittsburgh30k, and Google Landmarks v2 across six state-of-the-art VPR descriptors. DisPlace outperforms seven multi-reference baselines in 49 out of 54 appearance-varying conditions, consistently improves descriptor-level fusion performance under viewpoint and unstructured settings, and requires less storage during inference than all compared fusion methods.
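
The fusion idea in the abstract can be illustrated with a small NumPy sketch. It assumes a standard LDA-style construction (within-place and between-place scatter matrices, solved through Cholesky whitening), which may differ from the paper's actual formulation. The function names `fit_discriminative_projection`, `project`, and `fuse_references`, the regularization constant, and the synthetic data are all hypothetical and exist only for this example.

```python
import numpy as np

def fit_discriminative_projection(descriptors, k, reg=1e-3):
    """Fit a projection W (D x k) from descriptors of shape (P, R, D):
    P places, R reference traversals, D descriptor dimensions.
    Assumed LDA-style sketch, not the paper's exact method."""
    P, R, D = descriptors.shape
    place_means = descriptors.mean(axis=1)              # (P, D)
    global_mean = place_means.mean(axis=0)              # (D,)

    # Within-place scatter: variation across traversals of the same place.
    centered = (descriptors - place_means[:, None, :]).reshape(-1, D)
    S_w = centered.T @ centered / (P * R)

    # Between-place scatter: variation of place means around the global mean.
    diff = place_means - global_mean
    S_b = diff.T @ diff / P

    # Regularize S_w so that it is positive definite (assumed shrinkage form).
    S_w = S_w + (reg * np.trace(S_w) / D + 1e-8) * np.eye(D)

    # Solve S_b w = lambda S_w w by whitening with the Cholesky factor of S_w.
    L = np.linalg.cholesky(S_w)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    M = (M + M.T) / 2                                   # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:k]                 # k largest eigenvalues
    W = L_inv.T @ eigvecs[:, top]                       # (D, k)
    return W, global_mean

def project(x, W, global_mean):
    """Center, project, and L2-normalize descriptors of shape (N, D)."""
    z = (x - global_mean) @ W
    return z / np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1e-12)

def fuse_references(descriptors, W, global_mean):
    """Fuse R reference descriptors per place into one k-dim vector."""
    return project(descriptors.mean(axis=1), W, global_mean)

# Synthetic data: place identity lives in the first D_id dimensions,
# traversal-specific nuisance dominates the remaining ones.
rng = np.random.default_rng(0)
P, R, D, D_id = 50, 4, 24, 8
identity = np.zeros((P, D))
identity[:, :D_id] = rng.normal(size=(P, D_id))

def observe(n):
    noise = np.zeros((P, n, D))
    noise[:, :, :D_id] = 0.1 * rng.normal(size=(P, n, D_id))
    noise[:, :, D_id:] = 5.0 * rng.normal(size=(P, n, D - D_id))
    return identity[:, None, :] + noise

refs = observe(R)                       # (P, R, D) reference traversals
queries = observe(1)[:, 0, :]           # (P, D) one query per place

W, mu = fit_discriminative_projection(refs, k=D_id)
fused = fuse_references(refs, W, mu)    # (P, k) stored map
pred = (project(queries, W, mu) @ fused.T).argmax(axis=1)
accuracy = float((pred == np.arange(P)).mean())
print("top-1 accuracy on synthetic data:", accuracy)
```

In this sketch the stored map shrinks from P x R x D values to P x k, which mirrors the compact single-representation design described above. The synthetic setup only demonstrates the mechanism and says nothing about performance on the benchmarks named in the abstract.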

Problem

Research questions and friction points this paper is trying to address.

Visual Place Recognition

multi-reference fusion

descriptor variation

place identity

environmental conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual Place Recognition

multi-reference fusion

discriminative representation