🤖 AI Summary
This work addresses contactless material identification in unconstrained environments, where geometric variations (orientation, shape, distance) and single-modality ambiguity degrade performance. The authors propose GaMi, a cross-modal subtractive disentanglement framework that fuses millimeter-wave (mmWave) and acoustic sensing. Intra-sample cross-modal subtraction removes the geometric information shared by the two modalities while preserving intrinsic material signatures, and inter-sample contrastive learning corrects the residual interference caused by cross-modal misalignment. A pairing-based adaptation strategy between the two modalities enables few-shot generalization across devices. Evaluated on 20 materials, the method achieves 95.2% recognition accuracy and outperforms single-modality baselines under unseen geometric conditions.
📝 Abstract
Non-contact material identification enables adaptive interaction for embodied intelligence, yet it faces challenges from geometry-induced variations (e.g., orientation, shape, distance) and single-modality ambiguities. In this paper, we present GaMi, a multimodal material identification system that integrates mmWave and acoustic sensing to operate robustly under unconstrained geometric conditions. Building on the insight that co-located bimodal sensors observe consistent geometry, GaMi employs an intra-sample cross-modal subtractive disentanglement framework: it semantically aligns the two modalities and subtracts their shared geometric context to isolate intrinsic material features. GaMi further incorporates inter-sample contrastive learning to correct the residual interference caused by cross-modal misalignment. In addition, a pairing-based adaptation strategy between the two modalities enables few-shot generalization across devices. Extensive evaluations on 20 materials show that GaMi achieves 95.2% accuracy, outperforming single-modality baselines across unseen geometric conditions.
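The intuition behind cross-modal subtraction can be illustrated with a toy example. The sketch below is not GaMi's actual architecture, which the abstract does not specify. It assumes a purely additive linear model in which each modality's aligned feature vector is the sum of a shared geometry term and a modality-specific material term. All names (`cross_modal_residual`, `observe`, `material_mm`, `material_ac`) are hypothetical.

```python
import numpy as np

def cross_modal_residual(mmwave_feat, acoustic_feat):
    """Subtract aligned acoustic features from mmWave features.

    Under the additive assumption used here, the shared geometry term
    cancels and only the material-dependent difference remains.
    """
    return mmwave_feat - acoustic_feat

rng = np.random.default_rng(0)
dim = 8

# Assumed material signatures as seen by each modality (fixed per material).
material_mm = rng.normal(size=dim)
material_ac = rng.normal(size=dim)

# Two different geometric conditions (e.g., distance or orientation).
geometry_a = rng.normal(size=dim)
geometry_b = rng.normal(size=dim)

def observe(geometry):
    """Toy observation model: both co-located sensors see the same geometry."""
    return geometry + material_mm, geometry + material_ac

res_a = cross_modal_residual(*observe(geometry_a))
res_b = cross_modal_residual(*observe(geometry_b))

# The residual is identical under both geometries in this idealized model.
same = np.allclose(res_a, res_b)
```

In practice the two modalities are unlikely to share geometry this cleanly, which is presumably why the abstract adds semantic alignment before the subtraction and contrastive learning afterwards to handle residual misalignment.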