Exploiting Point-Language Models with Dual-Prompts for 3D Anomaly Detection

📅 2025-02-16
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
To address the high memory overhead and limited flexibility of one-class-one-model approaches to industrial 3D point cloud anomaly detection, this paper proposes PLANE, a framework that handles multiple categories with a single model. PLANE introduces (1) dual prompts spanning the text and point cloud modalities, each combining class-specific static prompts with sample-specific dynamic prompts; (2) a dynamic prompt creator module (DPCM) that generates the sample-specific prompts; and (3) Ano3D, a pseudo 3D anomaly generation method that improves detection in the unsupervised setting. Compared with state-of-the-art one-class-one-model methods, PLANE reports gains of +8.7%/+17% in anomaly detection/localization on Anomaly-ShapeNet and +4.3%/+4.1% on Real3D-AD. The results indicate that pre-trained Point-Language Models (PLMs) can be extended to multi-class, unsupervised 3D anomaly detection.

📝 Abstract
Anomaly detection (AD) in 3D point clouds is crucial in a wide range of industrial applications, especially in various forms of precision manufacturing. Considering the industrial demand for reliable 3D AD, several methods have been developed. However, most of these approaches require training a separate model for each category, which is memory-intensive and inflexible. In this paper, we propose a novel Point-Language model with dual-prompts for 3D ANomaly dEtection (PLANE). The approach leverages multi-modal prompts to extend the strong generalization capabilities of pre-trained Point-Language Models (PLMs) to the domain of 3D point cloud AD, achieving impressive detection performance across multiple categories using a single model. Specifically, we propose a dual-prompt learning method, incorporating both text and point cloud prompts. The method utilizes a dynamic prompt creator module (DPCM) to produce sample-specific dynamic prompts, which are then integrated with class-specific static prompts for each modality, effectively driving the PLMs. Additionally, based on the characteristics of point cloud data, we propose a pseudo 3D anomaly generation method (Ano3D) to improve the model's detection capabilities in an unsupervised setting. Experimental results demonstrate that the proposed method, which follows the multi-class-one-model paradigm, achieves gains of +8.7%/+17% in anomaly detection/localization performance over state-of-the-art one-class-one-model methods on the Anomaly-ShapeNet dataset, and gains of +4.3%/+4.1% on the Real3D-AD dataset. Code will be available upon publication.
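The abstract does not spell out how the dynamic and static prompts are integrated. As a rough illustration only, the sketch below assumes simple concatenation of token vectors and uses one hypothetical linear map per dynamic token as a stand-in for the DPCM. The names `linear` and `build_prompt`, the dimensions, and all weights are invented for this example and are not the authors' implementation.

```python
import random


def linear(vec, weights):
    """Multiply a feature vector by a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def build_prompt(static_tokens, sample_feature, creator_weights):
    """Concatenate static prompt tokens with tokens derived from one sample.

    static_tokens: list of d-dimensional vectors shared by every sample of a
        class (learned in the paper; fixed numbers here).
    sample_feature: d-dimensional feature of the current sample.
    creator_weights: one d x d matrix per dynamic token (ASSUMPTION: a toy
        stand-in for the dynamic prompt creator module).
    """
    dynamic_tokens = [linear(sample_feature, w) for w in creator_weights]
    # ASSUMPTION: integration is modeled as plain concatenation.
    return static_tokens + dynamic_tokens


# Usage: two samples of the same class share static tokens but get
# different dynamic tokens.
rng = random.Random(0)
d = 4
static_tokens = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(3)]
creator_weights = [
    [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
    for _ in range(2)
]
prompt_a = build_prompt(static_tokens, [0.1, 0.2, 0.3, 0.4], creator_weights)
prompt_b = build_prompt(static_tokens, [0.4, 0.3, 0.2, 0.1], creator_weights)
print(len(prompt_a))  # 5 tokens: 3 static + 2 dynamic
print(prompt_a[:3] == prompt_b[:3])  # True: the static part is shared
```

In the paper this kind of prompt is built for both the text and the point cloud modality; the sketch shows a single modality to keep the idea visible.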
Problem

Research questions and friction points this paper is trying to address.

Most 3D point cloud anomaly detection methods train a separate model for each category
Per-category models are memory-intensive and inflexible
Can pre-trained Point-Language Models be extended to multi-class 3D anomaly detection with a single model?
Innovation

Methods, ideas, or system contributions that make the work stand out.

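Dual-prompt learning with both text and point cloud prompts
Dynamic prompt creator module (DPCM) produces sample-specific prompts that are integrated with class-specific static prompts
Pseudo 3D anomaly generation (Ano3D) improves detection in the unsupervised setting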
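
The summary does not describe how Ano3D constructs its pseudo anomalies. To make the general idea of pseudo 3D anomaly generation concrete, here is a minimal sketch under stated assumptions: it picks a random local patch of an anomaly-free point cloud and pushes its points outward (a bulge) or inward (a dent), returning a per-point mask that could serve as a localization label. The function `synthesize_anomaly` and its parameters `radius` and `strength` are hypothetical and are not taken from the paper.

```python
import math
import random


def synthesize_anomaly(points, radius=0.2, strength=0.05, seed=None):
    """Return (new_points, mask) with a local bulge or dent added.

    points: list of (x, y, z) tuples from an anomaly-free sample.
    mask[i] is 1 if point i lies inside the deformed patch, else 0.
    """
    rng = random.Random(seed)
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    center = rng.choice(points)        # patch center on the surface
    sign = rng.choice((-1.0, 1.0))     # +1 gives a bulge, -1 gives a dent
    new_points, mask = [], []
    for p in points:
        dist = math.dist(p, center)
        if dist >= radius:
            new_points.append(p)
            mask.append(0)
            continue
        # ASSUMPTION: the direction from the object centroid to the point
        # is used as a rough outward direction.
        direction = [p[k] - centroid[k] for k in range(3)]
        norm = math.sqrt(sum(c * c for c in direction)) or 1.0
        # Displacement fades linearly to zero at the patch boundary.
        scale = sign * strength * (1.0 - dist / radius) / norm
        new_points.append(tuple(p[k] + scale * direction[k] for k in range(3)))
        mask.append(1)
    return new_points, mask


# Usage: deform a random patch on points sampled from a unit sphere.
rng = random.Random(1)
sphere = []
for _ in range(500):
    v = [rng.gauss(0, 1) for _ in range(3)]
    length = math.sqrt(sum(c * c for c in v))
    sphere.append(tuple(c / length for c in v))
deformed, mask = synthesize_anomaly(sphere, radius=0.4, strength=0.05, seed=0)
print(sum(mask), "of", len(mask), "points displaced")
```

Pairs of deformed clouds and masks like these give a model anomalous examples to train on even when only normal samples are available, which is the role the abstract assigns to Ano3D.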
Jiaxiang Wang
King's College London
semantic communications, generative AI, machine learning, wireless communication, information theory
Haote Xu
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen, 361005, Fujian, China; School of Informatics, Xiamen University, Xiamen, 361005, Fujian, China
Xiaolu Chen
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen, 361005, Fujian, China; School of Informatics, Xiamen University, Xiamen, 361005, Fujian, China
Haodi Xu
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen, 361005, Fujian, China; School of Informatics, Xiamen University, Xiamen, 361005, Fujian, China
Yue Huang
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen, 361005, Fujian, China; School of Informatics, Xiamen University, Xiamen, 361005, Fujian, China
Xinghao Ding
Unknown affiliation
Xiaotong Tu
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen, 361005, Fujian, China; School of Informatics, Xiamen University, Xiamen, 361005, Fujian, China