🤖 AI Summary
This work addresses the lack of precise geometric validation in existing text-to-CAD generation methods, which leaves dimensional errors uncorrected. We propose a multi-agent framework with two nested loops: an inner loop that iteratively fixes code execution errors, and an outer loop that refines the model using exact geometric measurements from the OpenCASCADE kernel together with a global shape assessment from a vision-language model. The approach combines programmatic geometric verification with visual-semantic judgment, and it uses retrieval-augmented generation (RAG) over API documentation instead of fine-tuning, so it can stay current as the underlying CAD library evolves. Evaluated on a newly curated benchmark of 100 prompts across three difficulty tiers, the method achieves a 100% execution success rate (up from 95% for a zero-shot baseline), improves median IoU from 0.8085 to 0.9629, and reduces mean Chamfer Distance from 28.37 to 0.74.
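The summary reports IoU and Chamfer Distance without defining them. As a rough illustration only, the sketch below computes a volumetric IoU for two axis-aligned boxes and a symmetric Chamfer distance between two small point sets. The function names `box_iou` and `chamfer_distance` are hypothetical, and the Chamfer convention used here (mean nearest-neighbour distance, summed over both directions) is one common choice; the paper's exact sampling and normalization may differ.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes, each given as (min_xyz, max_xyz)."""
    inter = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:
            return 0.0  # boxes do not overlap along this axis
        inter *= overlap
    vol_a = math.prod(hi - lo for lo, hi in zip(a[0], a[1]))
    vol_b = math.prod(hi - lo for lo, hi in zip(b[0], b[1]))
    return inter / (vol_a + vol_b - inter)


def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed convention: mean nearest-neighbour Euclidean distance from
    p to q, plus the same from q to p.
    """
    def one_way(src, dst):
        return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)

    return one_way(p, q) + one_way(q, p)


# A box that is half as tall as the reference overlaps it by half.
reference = ((0, 0, 0), (10, 10, 10))
candidate = ((0, 0, 0), (10, 10, 5))
iou = box_iou(reference, candidate)
```

Under these definitions a perfect reconstruction has an IoU of 1 and a Chamfer distance of 0, which matches the direction of the improvements reported above.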
📝 Abstract
Existing methods for text-to-CAD generation either operate in a single pass with no geometric verification or rely on lossy visual feedback that cannot resolve dimensional errors. We present CADSmith, a multi-agent pipeline that generates CadQuery code from natural language and then refines it through two nested correction loops: an inner loop that resolves execution errors and an outer loop grounded in programmatic geometric validation. The outer loop combines exact measurements from the OpenCASCADE kernel (bounding box dimensions, volume, solid validity) with holistic visual assessment from an independent vision-language model Judge. Together these provide both the numerical precision and the high-level shape awareness needed to converge on the correct geometry. The system uses retrieval-augmented generation over API documentation rather than fine-tuning, so its knowledge base can be kept current as the underlying CAD library evolves. We evaluate on a custom benchmark of 100 prompts in three difficulty tiers (T1 through T3) with three ablation configurations. Against a zero-shot baseline, CADSmith achieves a 100% execution rate (up from 95%), improves the median F1 score from 0.9707 to 0.9846 and the median IoU from 0.8085 to 0.9629, and reduces the mean Chamfer Distance from 28.37 to 0.74. These results demonstrate that closed-loop refinement with programmatic geometric feedback substantially improves the quality and reliability of LLM-generated CAD models.
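The two nested correction loops can be pictured with a minimal control-flow sketch. This is not CADSmith's implementation: the names `refine`, `ExecResult`, and `Verdict`, the callback signatures, and the iteration limits are all assumptions made for illustration, and the agents (code generator, executor, geometric measurement, Judge) are passed in as plain callables so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ExecResult:
    ok: bool
    shape: Any = None   # the built solid when execution succeeds
    error: str = ""     # the error message when it fails


@dataclass
class Verdict:
    accepted: bool
    feedback: str = ""  # what to fix when the geometry is rejected


def refine(
    prompt: str,
    generate: Callable[[str, Optional[str], str], str],
    execute: Callable[[str], ExecResult],
    measure: Callable[[Any], dict],
    judge: Callable[[str, dict], Verdict],
    max_outer: int = 3,   # hypothetical limit
    max_inner: int = 3,   # hypothetical limit
) -> Optional[str]:
    """Return accepted code, else the last code that executed, else None."""
    code: Optional[str] = None
    best: Optional[str] = None
    feedback = ""
    for _ in range(max_outer):
        code = generate(prompt, code, feedback)
        result = execute(code)

        # Inner loop: repair execution errors before judging geometry.
        attempts = 0
        while not result.ok and attempts < max_inner:
            code = generate(prompt, code, f"Execution error: {result.error}")
            result = execute(code)
            attempts += 1
        if not result.ok:
            feedback = f"Execution error: {result.error}"
            continue

        # Outer loop: validate the geometry and collect feedback.
        best = code
        verdict = judge(prompt, measure(result.shape))
        if verdict.accepted:
            return code
        feedback = verdict.feedback
    return best
```

In a real pipeline, `measure` would presumably wrap kernel queries for the quantities the abstract lists (in CadQuery, for example, `Shape.BoundingBox()`, `Shape.Volume()`, and `Shape.isValid()`), and `judge` would combine those numbers with the vision-language model's assessment of rendered views. Keeping the execution-repair loop inside the geometric loop means the Judge only ever sees code that actually produces a solid.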