HFBRI-MAE: Handcrafted Feature Based Rotation-Invariant Masked Autoencoder for 3D Point Cloud Analysis

📅 2025-04-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing masked autoencoder (MAE)-based self-supervised methods for 3D point clouds are sensitive to arbitrary rotations: they lack rotational invariance and thus suffer performance degradation in real-world scenarios. To address this, we propose the first rotation-invariant MAE framework. Our method introduces a dual-embedding mechanism that fuses handcrafted geometric features—including FPFH descriptors and normal distribution statistics—to jointly encode local and global rotation-invariant representations. We further redefine the reconstruction target as point clouds aligned to a canonical coordinate system, eliminating rotational ambiguity at the source. Additionally, we design a rotation-invariant positional encoding scheme. Extensive experiments on ModelNet40, ScanObjectNN, and ShapeNetPart demonstrate that our approach consistently surpasses state-of-the-art methods across classification, part segmentation, and few-shot learning tasks, significantly enhancing model robustness and generalization capability.
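The canonical-alignment idea can be sketched with a simple PCA-based normalization: rotate each cloud into the frame of its principal axes so that any rotated copy maps to the same target. This is a minimal illustration under the assumption of a PCA canonical frame; the paper's actual alignment procedure may differ.

```python
import numpy as np

def canonicalize(points):
    """Map a point cloud to a rotation-invariant canonical frame (PCA sketch)."""
    # Center the cloud, then rotate it into its PCA frame so the
    # principal axes of variance align with x, y, z.
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Order axes by descending variance.
    order = np.argsort(eigvals)[::-1]
    aligned = centered @ eigvecs[:, order]
    # Resolve the sign ambiguity of each eigenvector by forcing the
    # third moment (skewness direction) along each axis to be non-negative.
    signs = np.sign(np.sum(aligned ** 3, axis=0))
    signs[signs == 0] = 1.0
    return aligned * signs
```

Because eigenvectors of the covariance rotate with the data, a rotated input yields the same canonical coordinates, which removes the rotational ambiguity from the reconstruction target.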

📝 Abstract
Self-supervised learning (SSL) has demonstrated remarkable success in 3D point cloud analysis, particularly through masked autoencoders (MAEs). However, existing MAE-based methods lack rotation invariance, leading to significant performance degradation when processing arbitrarily rotated point clouds in real-world scenarios. To address this limitation, we introduce Handcrafted Feature-Based Rotation-Invariant Masked Autoencoder (HFBRI-MAE), a novel framework that refines the MAE design with rotation-invariant handcrafted features to ensure stable feature learning across different orientations. By leveraging both rotation-invariant local and global features for token embedding and position embedding, HFBRI-MAE effectively eliminates rotational dependencies while preserving rich geometric structures. Additionally, we redefine the reconstruction target to a canonically aligned version of the input, mitigating rotational ambiguities. Extensive experiments on ModelNet40, ScanObjectNN, and ShapeNetPart demonstrate that HFBRI-MAE consistently outperforms existing methods in object classification, segmentation, and few-shot learning, highlighting its robustness and strong generalization ability in real-world 3D applications.
Problem

Research questions and friction points this paper is trying to address.

Lack of rotation invariance in existing MAE-based 3D point cloud methods
Performance drop with arbitrarily rotated real-world point clouds
Need for stable feature learning across different orientations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rotation-invariant handcrafted features in MAE
Combined local and global feature embeddings
Reconstruction target aligned canonically
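To see why handcrafted geometric features sidestep orientation, consider a toy point-pair descriptor in the spirit of FPFH: distances, and angles between normals and the connecting vector, are preserved by any rigid rotation. The function name and feature choice here are illustrative, not the paper's exact descriptor.

```python
import numpy as np

def pair_feature(p1, n1, p2, n2):
    """Rotation-invariant feature for an oriented point pair.

    Returns the inter-point distance and the three angles used by
    PPF/FPFH-style descriptors; all four values are unchanged when the
    same rotation is applied to both points and both normals.
    """
    d = p2 - p1
    dist = np.linalg.norm(d)
    du = d / dist  # unit vector from p1 to p2

    def angle(a, b):
        # Clip guards against tiny numerical overshoot outside [-1, 1].
        return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

    return np.array([dist, angle(n1, du), angle(n2, du), angle(n1, n2)])
```

Token embeddings built from such quantities stay stable under arbitrary orientations, which is the property the dual local/global embedding exploits.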