GECO: Geometrically Consistent Embedding with Lightspeed Inference

📅 2025-08-01

🤖 AI Summary

Current self-supervised vision foundation models capture semantic correspondences but lack awareness of the underlying 3D geometry, which hinders fine-grained discrimination of parts that differ only by geometry (e.g., left/right eye, front/back leg). To address this, the authors propose a self-supervised training framework based on optimal transport that builds geometric consistency constraints directly into feature learning and provides supervision beyond annotated keypoints, even under occlusions and disocclusions. They also argue that PCK alone is insufficient to capture geometric quality and introduce new metrics intended to overcome its limitations in spatial continuity and scale robustness. With a lightweight architecture, GECO runs at 30 fps, reported as 98.2% faster than prior methods. On the PFPascal, APK, and CUB benchmarks, it improves PCK by 6.0%, 6.2%, and 4.1%, respectively, setting a new state of the art for fine-grained correspondence.

📝 Abstract

Recent advances in feature learning have shown that self-supervised vision foundation models can capture semantic correspondences but often lack awareness of underlying 3D geometry. GECO addresses this gap by producing geometrically coherent features that semantically distinguish parts based on geometry (e.g., left/right eyes, front/back legs). We propose a training framework based on optimal transport, enabling supervision beyond keypoints, even under occlusions and disocclusions. With a lightweight architecture, GECO runs at 30 fps, 98.2% faster than prior methods, while achieving state-of-the-art performance on PFPascal, APK, and CUB, improving PCK by 6.0%, 6.2%, and 4.1%, respectively. Finally, we show that PCK alone is insufficient to capture geometric quality and introduce new metrics and insights for more geometry-aware feature learning. Link to project page: https://reginehartwig.github.io/publications/geco/
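Since the abstract reports its results in PCK (Percentage of Correct Keypoints), a minimal sketch of the commonly used definition may help: a predicted keypoint counts as correct when it lies within `alpha * max(height, width)` of the ground-truth location. The function name `pck`, the default `alpha=0.1`, and the choice of the bounding box as the reference size are assumptions for illustration; the paper's exact evaluation protocol is not given here.

```python
import math


def pck(pred, gt, ref_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * max(ref_size) of ground truth.

    pred, gt : lists of (x, y) pairs in the same order.
    ref_size : (height, width) of the reference box (assumed: object bounding box).
    alpha    : tolerance as a fraction of the larger box side (assumed default).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of keypoints")
    if not gt:
        return 0.0
    threshold = alpha * max(ref_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)


# Hypothetical example: three keypoints, 100 x 80 reference box (threshold = 10 px).
predicted = [(10, 10), (50, 50), (90, 90)]
ground_truth = [(12, 10), (50, 70), (90, 95)]
score = pck(predicted, ground_truth, ref_size=(100, 80))
```

Because the score only checks each annotated keypoint against a distance threshold, it says nothing about the regions between keypoints, which is consistent with the abstract's point that PCK alone does not capture geometric quality.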

Problem

Research questions and friction points this paper is trying to address.

Bridges gap between semantic and 3D geometric awareness in features

Enables robust feature learning under occlusions and disocclusions

Improves geometric accuracy while maintaining high inference speed

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometrically coherent features via optimal transport

Lightweight architecture enabling 30 fps inference

New metrics for geometry-aware feature learning
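To make the optimal transport idea above more concrete, here is a minimal sketch of standard entropy-regularized optimal transport solved with Sinkhorn iterations. It is not the paper's formulation: the handling of occlusions and disocclusions, the cost function, and the loss are not described in this summary, so the function name `sinkhorn`, the parameters `eps` and `n_iters`, and the toy cost matrix are all assumptions for illustration. The output is a soft assignment (transport plan) between two sets of features whose rows and columns match the given marginals.

```python
import math


def sinkhorn(cost, a, b, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    cost : n x m nested list of pairwise costs (e.g., feature distances).
    a, b : source and target marginals (each non-negative, summing to 1).
    eps  : entropic regularization strength (assumed value).
    Returns the n x m transport plan as a nested list.
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical toy problem: two source features, two target features.
toy_cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(toy_cost, a=[0.5, 0.5], b=[0.5, 0.5])
```

In this toy case the plan concentrates mass on the low-cost diagonal pairs. A training framework could compare such a plan against a target assignment, although how GECO does this is not specified in the text above.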
