Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks

📅 2025-12-04
🤖 AI Summary
This work addresses the joint modeling of semantic and geometric information in pixel-level representation learning, with the goal of precise point correspondence across images without momentum-based mechanisms. We propose a family of stable contrastive losses that removes the need for the conventional momentum-based teacher-student setup. The method maps each pixel to an overcomplete descriptor that is both view-invariant and semantically meaningful, learned with pixel-wise contrastive training. Two experiments in synthetic 2D and 3D environments demonstrate the properties of the loss and of the resulting overcomplete representations.

📝 Abstract
We pilot a family of stable contrastive losses for learning pixel-level representations that jointly capture semantic and geometric information. Our approach maps each pixel of an image to an overcomplete descriptor that is both view-invariant and semantically meaningful. It enables precise point-correspondence across images without requiring momentum-based teacher-student training. Two experiments in synthetic 2D and 3D environments demonstrate the properties of our loss and the resulting overcomplete representations.
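To make the idea of a pixel-wise contrastive loss concrete, here is a minimal sketch of a generic InfoNCE-style loss over pixel descriptors. It is not the stable loss proposed in the paper, whose exact form is not given in this summary. The function names, the temperature value, and the toy descriptors are all assumptions chosen for illustration.

```python
import math


def normalize(v):
    """Scale a descriptor to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def pixel_info_nce(desc_a, desc_b, temperature=0.1):
    """Generic InfoNCE loss over pixel descriptors (an illustrative stand-in,
    NOT the paper's loss).

    desc_a[i] and desc_b[i] are assumed to describe the same scene point in
    two views; every other entry of desc_b acts as a negative for pixel i.
    """
    a = [normalize(v) for v in desc_a]
    b = [normalize(v) for v in desc_b]
    total = 0.0
    for i, va in enumerate(a):
        # Cosine similarity of pixel i in view A to every pixel in view B.
        logits = [sum(x * y for x, y in zip(va, vb)) / temperature for vb in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pixel as the correct class.
        total += log_z - logits[i]
    return total / len(a)


# Toy example: two pixels with 3-dimensional descriptors (hypothetical data).
view_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
aligned = pixel_info_nce(view_a, view_a)            # matching pixels agree
mismatched = pixel_info_nce(view_a, view_a[::-1])   # matching pixels swapped
```

In this toy case the loss is close to zero when corresponding pixels share a descriptor and grows large when the correspondence is swapped, which is the behavior a contrastive objective of this kind is meant to reward.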
Problem

Research questions and friction points this paper is trying to address.

Develop stable contrastive losses for pixel-level representation learning
Create view-invariant and semantically meaningful overcomplete descriptors
Enable precise point-correspondence without momentum-based teacher-student training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stable contrastive losses for pixel-level representations
Overcomplete descriptors for view-invariant semantics
Point-correspondence without momentum-based teacher-student training
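As a rough illustration of how per-pixel descriptors could be used for point correspondence, the sketch below matches a query descriptor from one image against a descriptor map of another image by cosine similarity. Nearest-neighbor matching is an assumption made here for illustration; the summary does not state how the paper performs matching. The names `cosine` and `match_point` and the toy descriptor map are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two descriptors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def match_point(query, descriptor_map):
    """Return ((row, col), similarity) of the pixel in descriptor_map whose
    descriptor is most similar to the query descriptor."""
    best, best_sim = None, -math.inf
    for r, row in enumerate(descriptor_map):
        for c, desc in enumerate(row):
            sim = cosine(query, desc)
            if sim > best_sim:
                best, best_sim = (r, c), sim
    return best, best_sim


# Hypothetical 2x2 descriptor map for a second view of the scene.
map_b = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
# Descriptor of a pixel in the first view, close to the one at (1, 0) in map_b.
query = [0.1, 0.0, 0.9]
location, similarity = match_point(query, map_b)
```

With view-invariant descriptors, the best match in the second image would be expected to land on the same scene point as the query pixel.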
Leonid Pogorelyuk
Rensselaer Polytechnic Institute, Troy, NY, USA
Niels Bracher
Rensselaer Polytechnic Institute, Troy, NY, USA
Aaron Verkleeren
Rensselaer Polytechnic Institute, Troy, NY, USA
Lars Kühmichel
Technical University Dortmund, Dortmund, Germany
Stefan T. Radev
Assistant Professor, Rensselaer Polytechnic Institute
Deep Learning, Bayesian Statistics, Stochastic Models, Machine Learning, Cognitive Modeling