HOT-POT: Optimal Transport for Sparse Stereo Matching

📅 2026-01-18
🤖 AI Summary
This work addresses the ill-posed nature of sparse feature matching in stereo settings, with facial landmarks as the main example. Occlusion, motion, and camera distortion exacerbate matching ambiguities, particularly when the landmark sets follow divergent annotation protocols. The authors combine optimal transport with geometric constraints derived from multi-view camera geometry: image points are modeled as 3D rays, and the matching cost is built from the epipolar distance and a ray distance, which yields a partial optimal transport problem that can be solved efficiently. The framework is then extended to a hierarchical, unsupervised keypoint matching pipeline. The summary presents this as the first integration of optimal transport theory with multi-view geometry for cross-annotation landmark alignment, and reports robustness and practicality in sparse facial analysis scenarios.

📝 Abstract
Stereo vision between images faces a range of challenges, including occlusions, motion, and camera distortions, across applications in autonomous driving, robotics, and face analysis. Due to parameter sensitivity, further complications arise for stereo matching with sparse features, such as facial landmarks. To overcome this ill-posedness and enable unsupervised sparse matching, we consider line constraints of the camera geometry from an optimal transport (OT) viewpoint. Formulating camera-projected points as (half)lines, we propose the use of the classical epipolar distance as well as a 3D ray distance to quantify matching quality. Employing these distances as a cost function of a (partial) OT problem, we arrive at efficiently solvable assignment problems. Moreover, we extend our approach to unsupervised object matching by formulating it as a hierarchical OT problem. The resulting algorithms allow for efficient feature and object matching, as demonstrated in our numerical experiments. Here, we focus on applications in facial analysis, where we aim to match distinct landmarking conventions.
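The idea of using an epipolar distance as the cost of an assignment problem can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a known fundamental matrix `F`, uses the symmetric point-to-epipolar-line distance as the cost, and replaces the (partial) OT solver with a discrete assignment padded by dummy points, which is a special case with unit masses. The function names `epipolar_cost` and `partial_match` and the parameter `unmatched_cost` are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def epipolar_cost(x1, x2, F):
    """Symmetric epipolar distance for all pairs of points.

    x1: (n, 2) pixel coordinates in image 1
    x2: (m, 2) pixel coordinates in image 2
    F:  (3, 3) fundamental matrix, with x2_h^T F x1_h = 0 for true matches
    Returns an (n, m) cost matrix. Assumes no epipolar line is degenerate
    (the first two line coefficients are not both zero).
    """
    h1 = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous, (n, 3)
    h2 = np.hstack([x2, np.ones((len(x2), 1))])  # homogeneous, (m, 3)
    lines2 = h1 @ F.T  # row i is F x1_i, the epipolar line in image 2
    lines1 = h2 @ F    # row j is F^T x2_j, the epipolar line in image 1
    algebraic = np.abs(lines2 @ h2.T)  # |x2_j^T F x1_i|, shape (n, m)
    d2 = algebraic / np.linalg.norm(lines2[:, :2], axis=1)[:, None]
    d1 = algebraic / np.linalg.norm(lines1[:, :2], axis=1)[None, :]
    return d1 + d2


def partial_match(cost, unmatched_cost):
    """Assignment in which any point may stay unmatched at a fixed price.

    The cost matrix is padded with dummy rows and columns, so a real pair
    (i, j) is only selected if that is cheaper than leaving both points
    unmatched (illustrative stand-in for a partial OT solver).
    """
    n, m = cost.shape
    padded = np.zeros((n + m, m + n))
    padded[:n, :m] = cost
    padded[:n, m:] = unmatched_cost  # real row assigned to a dummy column
    padded[n:, :m] = unmatched_cost  # real column assigned to a dummy row
    rows, cols = linear_sum_assignment(padded)
    return [(int(i), int(j)) for i, j in zip(rows, cols) if i < n and j < m]


if __name__ == "__main__":
    # Fundamental matrix of an ideal rectified pair: matches share a row.
    F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
    x1 = np.array([[10.0, 5.0], [20.0, 15.0], [30.0, 25.0], [40.0, 100.0]])
    x2 = np.array([[12.0, 25.0], [8.0, 5.0], [18.0, 15.0]])
    print(partial_match(epipolar_cost(x1, x2, F), unmatched_cost=5.0))
```

In this toy setup the fourth point of `x1` has no counterpart, so it is absorbed by a dummy column instead of forcing a bad match. A soft transport plan, non-uniform masses, or the 3D ray distance mentioned in the abstract would require a dedicated OT solver and calibrated cameras, which this sketch does not cover.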
Problem

Research questions and friction points this paper is trying to address.

sparse stereo matching
occlusions
camera distortions
unsupervised matching
facial landmarks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal Transport
Sparse Stereo Matching
Epipolar Geometry
Unsupervised Matching
Hierarchical OT
Antonin Clerc
Institute of Mathematics, Technische Universität Berlin, Germany; Univ. Bordeaux, CNRS, Bordeaux INP, IMB, UMR 5251, F-33400 Talence, France
Michael Quellmalz
Technische Universität Berlin
Applied Mathematics, Inverse Problems
Moritz Piening
Institute of Mathematics, Technische Universität Berlin, Germany
Philipp Flotho
Chair for Clinical Bioinformatics, Saarland Informatics Campus, Saarland University
Self-supervised Learning, Biomedical Imaging, Optical Imaging, Deep Learning, Computer Vision
Gregor Kornhardt
Institute of Mathematics, Technische Universität Berlin, Germany
Gabriele Steidl
TU Berlin
Computational harmonic analysis, optimization, image processing, machine learning