🤖 AI Summary
Dense matching on panoramic images is difficult because equirectangular projection (ERP) introduces strong geometric distortion. This paper proposes EDM, which it describes as the first learning-based dense matching framework grounded in spherical geometry. The method replaces conventional planar positional embeddings with spherical positional embeddings built from the 3D Cartesian coordinates of points on the unit sphere. It also applies bidirectional transformations between spherical and Cartesian coordinates to reduce the effect of ERP distortion, and it adds a geodesic flow refinement step that refines matches directly on the sphere. On Matterport3D and Stanford2D3D, the approach improves AUC@5° by 26.72 and 42.62 points, respectively, which indicates that it substantially mitigates ERP-induced distortion in panoramic dense matching.
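To make the distortion argument concrete, the following minimal sketch compares the geodesic (great-circle) distance between two points that lie the same number of ERP columns apart, once on the equator and once near a pole. The function names and the longitude/latitude axis convention are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def unit_vector(lon, lat):
    """Unit-sphere point for longitude/latitude in radians (assumed convention: y is up)."""
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )

def geodesic_distance(a, b):
    """Great-circle angle in radians between two unit vectors.

    atan2(|a x b|, a . b) is used instead of acos(a . b) because it stays
    accurate for nearly identical and nearly opposite points.
    """
    dot = sum(p * q for p, q in zip(a, b))
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    return math.atan2(math.sqrt(sum(c * c for c in cross)), dot)

def same_row_gap(lat_deg, dlon_deg=10.0):
    """Geodesic distance between two points on one ERP row, dlon_deg of longitude apart."""
    lat = math.radians(lat_deg)
    a = unit_vector(0.0, lat)
    b = unit_vector(math.radians(dlon_deg), lat)
    return geodesic_distance(a, b)

if __name__ == "__main__":
    # The same horizontal pixel offset covers far less of the sphere near the pole.
    print("equator :", same_row_gap(0.0))
    print("lat 80  :", same_row_gap(80.0))
```

Both point pairs are equally far apart in ERP pixels, yet the pair at 80° latitude is much closer on the sphere. A planar distance or planar positional embedding cannot express that difference, which is the motivation for measuring and refining matches on the sphere.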
📝 Abstract
We introduce the first learning-based dense matching algorithm specifically designed for omnidirectional images, termed Equirectangular Projection-Oriented Dense Kernelized Feature Matching (EDM). Equirectangular projection (ERP) images, with their large fields of view, are particularly suited to dense matching techniques that aim to establish comprehensive correspondences across images. However, ERP images are subject to significant distortions, which we address by incorporating the spherical camera model and geodesic flow refinement into the dense matching method. To further mitigate these distortions, we propose spherical positional embeddings based on the 3D Cartesian coordinates of the feature grid. Additionally, our method applies bidirectional transformations between spherical and Cartesian coordinate systems during refinement, using a unit sphere to improve matching performance. Our method improves AUC@5° by +26.72 and +42.62 on the Matterport3D and Stanford2D3D datasets, respectively.
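The coordinate handling described above can be sketched as follows: ERP pixel centers are mapped to longitude and latitude, lifted to 3D Cartesian points on the unit sphere, and mapped back. This is an illustrative sketch under an assumed convention (longitude in [-π, π), latitude in [-π/2, π/2], y axis up, pixel centers at half-integer offsets); the helper names are hypothetical, and the paper's actual convention and embedding network may differ.

```python
import math

def erp_to_sphere(u, v, width, height):
    """ERP pixel index (u, v) -> (lon, lat) in radians, sampled at the pixel center."""
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    return lon, lat

def sphere_to_cartesian(lon, lat):
    """(lon, lat) -> point (x, y, z) on the unit sphere."""
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )

def cartesian_to_sphere(x, y, z):
    """3D point -> (lon, lat); the input is normalized so it need not be unit length."""
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y / norm)))
    return lon, lat

def sphere_to_erp(lon, lat, width, height):
    """(lon, lat) -> continuous ERP pixel coordinates (inverse of erp_to_sphere)."""
    u = (lon / (2.0 * math.pi) + 0.5) * width - 0.5
    v = (0.5 - lat / math.pi) * height - 0.5
    return u, v

def spherical_position_grid(width, height):
    """height x width grid of unit-sphere coordinates, one (x, y, z) per feature cell.

    In a learned model these raw coordinates would be fed to an embedding
    layer; that part is omitted here.
    """
    return [
        [sphere_to_cartesian(*erp_to_sphere(u, v, width, height)) for u in range(width)]
        for v in range(height)
    ]

if __name__ == "__main__":
    width, height = 8, 4
    lon, lat = erp_to_sphere(3, 1, width, height)
    xyz = sphere_to_cartesian(lon, lat)
    # Round trip: pixel -> sphere -> Cartesian -> sphere -> pixel.
    print(sphere_to_erp(*cartesian_to_sphere(*xyz), width, height))
```

Because every grid cell is represented by a point on the unit sphere, cells in the first and last ERP columns receive nearly identical coordinates, so the positional input stays continuous across the left and right image borders, unlike a planar (u, v) encoding.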