AI Summary
Existing implicit neural representation (INR) shape similarity measures suffer from three key limitations: (1) architectural incompatibility across diverse INR backbones (e.g., octrees, tri-planes, hash grids); (2) functional incompatibility across implicit representations (e.g., signed distance functions, occupancy fields); and (3) reliance on MLP-specific assumptions or costly geometric reconstruction (e.g., point cloud or multi-view rendering), introducing significant overhead. This paper proposes the first architecture-agnostic and function-agnostic INR shape similarity learning framework. Our method leverages geometry-aware feature distillation and normalized embedding space alignment to construct a unified deep metric learning model, operating directly in the implicit representation space without explicit geometric reconstruction. It enables cross-paradigm matching among heterogeneous INRs. Extensive experiments on multiple INR benchmarks demonstrate substantial improvements in retrieval accuracy over state-of-the-art baselines. Notably, our approach achieves higher precision than reconstruction-based methods while eliminating reconstruction latency and computational overhead.
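The core idea of normalized embedding space alignment can be sketched as follows. This is a minimal illustration, not the paper's implementation: the architecture-specific encoders, embedding dimension, and variable names are hypothetical. Once embeddings from heterogeneous INR backbones are L2-normalized into a shared space, a plain dot product gives cosine similarity, so cross-architecture retrieval reduces to a nearest-neighbor search.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Project embeddings onto the unit hypersphere so that the
    # dot product between two embeddings equals cosine similarity.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

# Hypothetical embeddings produced by architecture-specific encoders
# (e.g., one encoder for hash-grid INRs, another for tri-plane INRs).
# Random vectors stand in for real encoder outputs.
rng = np.random.default_rng(0)
emb_store = l2_normalize(rng.normal(size=(4, 128)))   # 4 stored INRs
emb_query = l2_normalize(rng.normal(size=(128,)))     # 1 query INR

# Cosine similarity between the query and each stored INR embedding;
# the highest-scoring entry is the retrieved match.
scores = emb_store @ emb_query
best = int(np.argmax(scores))
```

Because all embeddings live in the same normalized space, the query INR and the stored INRs need not share an architecture or an implicit function type.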
Abstract
Implicit neural representations (INRs) have become an important method for encoding various data types, such as 3D objects or scenes, images, and videos. They have proven to be particularly effective at representing 3D content, e.g., 3D scene reconstruction from 2D images, novel 3D content creation, and the representation, interpolation, and completion of 3D shapes. With the widespread generation of 3D data in an INR format, there is a need to support effective organization and retrieval of INRs saved in a data store. A key aspect of retrieval and clustering of INRs in a data store is the formulation of similarity between INRs that would, for example, enable retrieval of similar INRs using a query INR. In this work, we propose INRet, a method for determining similarity between INRs that represent shapes, thus enabling accurate retrieval of similar shape INRs from an INR data store. INRet flexibly supports different INR architectures such as INRs with octree grids, triplanes, and hash grids, as well as different implicit functions including signed/unsigned distance functions and occupancy fields. We demonstrate that our method is more general and accurate than the existing INR retrieval method, which only supports simple MLP INRs and requires the same architecture between the query and stored INRs. Furthermore, compared to converting INRs to other representations (e.g., point clouds or multi-view images) for 3D shape retrieval, INRet achieves higher accuracy while avoiding the conversion overhead.
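Training an embedding space in which similar shapes land close together is a deep metric learning problem. The abstract does not state the exact training objective, so the sketch below uses a standard triplet margin loss as one common choice: an anchor embedding is pulled toward a positive (the same shape, possibly encoded with a different INR architecture) and pushed away from a negative (a different shape). All names and the margin value are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Standard triplet margin loss on embedding vectors:
    # encourage d(anchor, positive) + margin <= d(anchor, negative).
    d_pos = float(np.linalg.norm(anchor - positive))
    d_neg = float(np.linalg.norm(anchor - negative))
    return max(0.0, d_pos - d_neg + margin)

# Toy usage: when the positive is already much closer than the
# negative, the loss is zero; otherwise it is positive.
a = np.zeros(8)          # anchor embedding
p = np.zeros(8)          # positive: same shape, near the anchor
n = np.ones(8)           # negative: different shape, far away
loss = triplet_loss(a, p, n)
```

In a cross-architecture setting, drawing anchor and positive from different INR backbones of the same shape is what forces the learned space to be architecture-agnostic.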