KDH-CAD: Knowledge-data hybrid CAD learning under data scarcity

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited performance of deep learning for computer-aided design (CAD) caused by the scarcity of real-world annotated data. The authors propose a data-efficient learning paradigm that uses pretrained foundation models without fine-tuning them. They formulate CAD learning as a knowledge completion and calibration task and combine structured domain knowledge (textbooks and tutorials) with foundation models through two components: a knowledge-guided concept completion mechanism and a few-shot latent space calibration technique. On real-world mechanical part classification, the approach reaches 92.6% accuracy with only 250 training samples and 95.8% with 1,000 samples, matching or exceeding existing methods that typically require an order of magnitude more data. The results point to a substantially reduced reliance on large-scale labeled datasets for CAD learning.

📝 Abstract

Deep learning in computer-aided design (CAD) remains fundamentally constrained by data scarcity: authentic CAD data is difficult to collect at scale, while synthetic data may not faithfully reflect real design practice. Rather than pursuing ever-larger CAD datasets, this paper instead treats CAD learning as a knowledge completion and calibration problem. It introduces KDH-CAD, a knowledge-data hybrid framework that integrates pretrained knowledge in foundation models, structured domain knowledge from textbooks and tutorials, and a very small amount of labeled CAD data. Domain knowledge is used to elicit and complete CAD-relevant concepts that are weakly expressed or under-represented in pretrained foundation models, while labeled CAD data calibrates these concepts in the latent space to account for task-specific geometric variability, without fine-tuning the foundation model. Experiments on real-world mechanical part classification show that KDH-CAD achieves strong performance in low-data regimes, reaching 92.6% accuracy with only 250 training samples and 95.8% with 1,000 samples, and continuing to improve with additional data. This matches or exceeds state-of-the-art performance that typically requires an order of magnitude more data. These results suggest that combining pretrained foundation models with structured domain knowledge can substantially reduce reliance on large-scale CAD datasets, providing a principled and practical direction for data-efficient CAD learning.
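
The abstract describes the calibration step only at a high level, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that each class has a knowledge-derived concept embedding (for example, from textbook descriptions encoded by a frozen model) and that a few labeled samples shift those embeddings toward the observed class means in the latent space. The function names, the blending weight `alpha`, the nearest-prototype classifier, and the synthetic vectors standing in for frozen foundation-model features are all hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def calibrate_prototypes(concept_emb, feats, labels, alpha=0.5):
    """Blend knowledge-derived class embeddings with few-shot class means.

    alpha = 0 keeps the knowledge prior only; alpha = 1 uses the data only.
    This blending rule is an illustrative assumption, not the paper's method.
    """
    concept_emb = l2_normalize(concept_emb)
    feats = l2_normalize(feats)
    protos = np.empty_like(concept_emb)
    for c in range(concept_emb.shape[0]):
        mask = labels == c
        if mask.any():
            class_mean = l2_normalize(feats[mask].mean(axis=0))
            protos[c] = (1 - alpha) * concept_emb[c] + alpha * class_mean
        else:
            # No labeled samples for this class: fall back to the prior.
            protos[c] = concept_emb[c]
    return l2_normalize(protos)

def predict(protos, feats):
    """Assign each feature vector to the most similar prototype."""
    return np.argmax(l2_normalize(feats) @ protos.T, axis=1)

# Synthetic stand-ins for frozen foundation-model features (assumption).
rng = np.random.default_rng(0)
n_classes, dim, shots = 4, 32, 5
true_centers = l2_normalize(rng.normal(size=(n_classes, dim)))

# Noisy "knowledge" embeddings: roughly right, but imprecise.
concept_emb = true_centers + 0.5 * rng.normal(size=(n_classes, dim))

# A handful of labeled samples per class; the backbone is never updated.
train_labels = np.repeat(np.arange(n_classes), shots)
train_feats = true_centers[train_labels] + 0.1 * rng.normal(
    size=(n_classes * shots, dim)
)

protos = calibrate_prototypes(concept_emb, train_feats, train_labels, alpha=0.5)

test_labels = np.repeat(np.arange(n_classes), 50)
test_feats = true_centers[test_labels] + 0.1 * rng.normal(
    size=(test_labels.size, dim)
)
preds = predict(protos, test_feats)
accuracy = float((preds == test_labels).mean())
print(f"accuracy on synthetic data: {accuracy:.3f}")
```

Only the prototypes change in this sketch; the feature extractor stays fixed, which mirrors the abstract's statement that the foundation model is not fine-tuned. The accuracy printed here comes from synthetic data and says nothing about the reported results.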

Problem

Research questions and friction points this paper is trying to address.

data scarcity

computer-aided design

deep learning

CAD data

knowledge-data hybrid

Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge-data hybrid learning

foundation models

domain knowledge integration