🤖 AI Summary
This work addresses the limited performance of deep learning for computer-aided design (CAD) caused by the scarcity of real-world annotated data. The authors propose a data-efficient learning paradigm that leverages pre-trained foundation models without fine-tuning them. By formulating CAD learning as a knowledge completion and calibration task, they combine structured domain knowledge drawn from textbooks and tutorials with the capabilities of foundation models, using a knowledge-guided concept completion mechanism and a few-shot latent space calibration technique. On real-world mechanical part classification, the approach reaches 92.6% accuracy with only 250 training samples and 95.8% with 1,000 samples, matching or exceeding existing methods that typically require an order of magnitude more data, and pointing to a new pathway for data-efficient CAD learning.
📝 Abstract
Deep learning in computer-aided design (CAD) remains fundamentally constrained by data scarcity: authentic CAD data is difficult to collect at scale, while synthetic data may not faithfully reflect real design practice. Rather than pursuing ever-larger CAD datasets, this paper treats CAD learning as a knowledge completion and calibration problem. It introduces KDH-CAD, a knowledge-data hybrid framework that integrates pretrained knowledge in foundation models, structured domain knowledge from textbooks and tutorials, and a very small amount of labeled CAD data. Domain knowledge is used to elicit and complete CAD-relevant concepts that are weakly expressed or under-represented in pretrained foundation models, while labeled CAD data calibrates these concepts in the latent space to account for task-specific geometric variability, without fine-tuning the foundation model. Experiments on real-world mechanical part classification show that KDH-CAD achieves strong performance in low-data regimes, reaching 92.6% accuracy with only 250 training samples and 95.8% with 1,000 samples, and continuing to improve with additional data. This matches or exceeds state-of-the-art performance that typically requires an order of magnitude more data. These results suggest that combining pretrained foundation models with structured domain knowledge can substantially reduce reliance on large-scale CAD datasets, providing a principled and practical direction for data-efficient CAD learning.
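The abstract does not specify how concepts are calibrated in the latent space, so the following is only an illustrative sketch of one generic way such a step could look, not KDH-CAD's actual algorithm. It assumes that a frozen foundation model has already produced embeddings, that each class has a knowledge-derived prototype vector (simulated here with random data), and that calibration is a simple blend of each prototype with the mean embedding of a few labeled samples. The function names, the blending weight `alpha`, and the synthetic data are all hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to (approximately) unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def calibrate_prototypes(text_protos, feats, labels, alpha=0.5):
    """Blend knowledge-derived prototypes with few-shot class means.

    Hypothetical calibration rule: nothing is fine-tuned; only the
    per-class prototype vectors are adjusted.
    """
    protos = l2_normalize(text_protos).copy()
    feats = l2_normalize(feats)
    for c in range(protos.shape[0]):
        mask = labels == c
        if mask.any():  # classes without labeled samples keep their prototype
            class_mean = l2_normalize(feats[mask].mean(axis=0))
            protos[c] = (1 - alpha) * protos[c] + alpha * class_mean
    return l2_normalize(protos)


def classify(feats, protos):
    """Assign each embedding to the prototype with highest cosine similarity."""
    return (l2_normalize(feats) @ protos.T).argmax(axis=1)


# Synthetic stand-ins for frozen-model outputs (assumption, not real CAD data).
rng = np.random.default_rng(0)
n_classes, dim, shots = 3, 16, 5
true_centers = l2_normalize(rng.normal(size=(n_classes, dim)))
# Knowledge-derived prototypes: a noisy view of the true class centers.
text_protos = true_centers + 0.5 * rng.normal(size=(n_classes, dim))
# A handful of labeled embeddings per class.
labels = np.repeat(np.arange(n_classes), shots)
feats = true_centers[labels] + 0.1 * rng.normal(size=(labels.size, dim))

protos = calibrate_prototypes(text_protos, feats, labels)
preds = classify(feats, protos)
```

Because only the prototype vectors change, the cost of this kind of calibration grows with the number of classes and labeled samples rather than with the size of the foundation model, which is the general appeal of calibrating in the latent space instead of fine-tuning.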