🤖 AI Summary
Existing 3D representation learning methods often rely on extrinsic geometry or high-level semantics, which makes it difficult to capture the intrinsic structure and manifold topology of shapes. This work proposes PRISM, a pretraining paradigm that uses geodesic distance recovery as a self-supervised signal to learn intrinsic geometry through isometric embeddings. The approach adds a topology-preserving constraint on the latent space, together with a two-stage training strategy that addresses the inherent imbalance in geodesic distance distributions. The method is reported to be accurate, robust, and efficient in geodesic distance prediction and to achieve superior performance on downstream tasks including shape recognition, surface parameterization, and non-rigid correspondence.
📝 Abstract
Geometric analysis fundamentally distinguishes between \textit{extrinsic} and \textit{intrinsic} perspectives. The dominant paradigm in current 3D representation learning relies on either extrinsic spatial structures or high-level semantics, and therefore struggles to capture the essence of shape identity and the underlying manifold topology. To bridge this gap, we introduce \textbf{PRISM}, a novel 3D representation learning paradigm for \textbf{P}re-training that learns isometric embeddings by \textbf{R}ecovering the \textbf{I}ntrinsic \textbf{S}urface geodesic \textbf{M}etric. PRISM incorporates a topology-enforcing objective that explicitly constrains the structure of the latent space, alongside a specialized two-stage training recipe that mitigates the sample imbalance inherent in the distribution of geodesic distances. Experiments demonstrate that our approach achieves satisfactory accuracy, robustness, and high efficiency in geodesic distance prediction, and superior performance across diverse downstream tasks, including shape recognition, surface parameterization, and non-rigid correspondence. The code will be publicly available at https://github.com/AidenZhao/PRISM.
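To make the idea of an isometric embedding concrete, the following is a minimal toy sketch, not the PRISM implementation. It assumes a simple objective in which pairwise Euclidean distances between latent codes should match geodesic distances, and it approximates geodesics by shortest paths on an edge graph. The function names `graph_geodesics` and `isometry_loss` are hypothetical and chosen only for illustration.

```python
import heapq
import numpy as np


def graph_geodesics(points, edges):
    """Approximate all-pairs geodesic distances by shortest paths on an edge graph."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for i, j in edges:
        w = float(np.linalg.norm(points[i] - points[j]))
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = np.full((n, n), np.inf)
    for src in range(n):
        # Dijkstra from each source vertex.
        dist[src, src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[src, u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                nd = d + w
                if nd < dist[src, v]:
                    dist[src, v] = nd
                    heapq.heappush(heap, (nd, v))
    return dist


def isometry_loss(embeddings, geodesic):
    """Mean squared gap between latent Euclidean distances and geodesic distances.

    Assumed toy objective for illustration; the paper's actual loss may differ.
    """
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    latent = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(embeddings), k=1)  # each unordered pair once
    return float(np.mean((latent[iu] - geodesic[iu]) ** 2))


# Toy "shape": a half circle sampled as a polyline.
theta = np.linspace(0.0, np.pi, 50)
points = np.stack([np.cos(theta), np.sin(theta)], axis=1)
edges = [(i, i + 1) for i in range(len(points) - 1)]
geo = graph_geodesics(points, edges)

# Extrinsic coordinates measure chords, so they do not reproduce geodesics.
loss_extrinsic = isometry_loss(points, geo)

# The arc-length coordinate is an isometric embedding of this curve into a line.
arc_length = geo[0][:, None]
loss_intrinsic = isometry_loss(arc_length, geo)

print(f"extrinsic: {loss_extrinsic:.4f}, intrinsic: {loss_intrinsic:.2e}")
```

In this toy setting the raw coordinates incur a clearly nonzero loss because chord lengths underestimate arc lengths, while the arc-length embedding drives the loss to numerical zero. A learned encoder trained under such an objective would be pushed toward the second kind of representation.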