🤖 AI Summary
This work proposes IterCAD, a closed-loop, multimodal agent framework for CAD automation. It targets a limitation of existing open-loop, single-pass approaches: they do not support the iterative interaction inherent in real-world design. Through multi-turn interaction with an executable CAD sandbox, the framework unifies three tasks: drawing-to-code, text-to-code, and interactive editing. It combines progressive supervised fine-tuning, geometry-aware reinforcement learning with viable-prefix masking, and a data synthesis pipeline that produces multi-view engineering drawings, code-editing tasks, and interaction trajectories. The study also introduces the IterCAD-Bench evaluation suite and the Chamfer Distance Tolerance-Recall (CD-TR) curve with its AUC-TR metric, an evaluation standard designed to be free of survivorship bias. The authors report that the framework significantly outperforms existing methods in code executability, geometric fidelity, and closed-loop iterative refinement.
📝 Abstract
Computer-Aided Design (CAD) is pivotal in modern manufacturing, yet existing automated methods predominantly rely on open-loop, one-shot generation, creating a mismatch with iterative real-world practices. In this paper, we present IterCAD, a unified multimodal agent framework for closed-loop, interactive CAD generation and editing. We formulate the task as a multi-turn interaction between a multimodal agent and an executable CAD sandbox, covering three tasks: Drawing-to-Code, Text-to-Code, and Interactive Editing. To support this, we develop a data synthesis pipeline incorporating advanced industrial manufacturing features to generate standard-compliant multi-view engineering drawings, complex code-editing tasks, and high-fidelity interaction trajectories. We optimize the agent via progressive supervised fine-tuning (SFT) followed by geometry-aware reinforcement learning with viable-prefix masking to enhance code executability and geometric fidelity. Finally, we introduce the IterCAD-Bench evaluation suite and propose the Chamfer Distance Tolerance-Recall (CD-TR) curve alongside its AUC-TR metric, establishing a survivorship-bias-free standard that unifies code validity and geometric precision. Extensive experiments demonstrate that IterCAD achieves highly competitive performance across multiple benchmarks, significantly outperforming existing approaches in both code executability and geometric precision, while exhibiting superior capabilities in closed-loop iterative refinement.
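The abstract does not give the exact definition of the CD-TR curve, so the following is only a minimal sketch of one plausible reading. It assumes that recall at tolerance τ is the fraction of all test samples whose Chamfer distance is at most τ, and that samples whose code fails to execute stay in the denominator as misses. That assumption is what would remove survivorship bias and fold code validity into the same number. The function names, the `None` convention for failed executions, and the normalization of the area by the tolerance range are hypothetical choices, not the authors' implementation.

```python
from typing import Optional, Sequence


def tolerance_recall_curve(
    chamfer: Sequence[Optional[float]],
    tolerances: Sequence[float],
) -> list[float]:
    """Recall at each tolerance, computed over ALL samples.

    `chamfer[i]` is the Chamfer distance of sample i, or None when the
    generated code did not execute (assumed convention). Failed samples
    never count as hits but always count in the denominator.
    """
    n = len(chamfer)
    if n == 0:
        raise ValueError("need at least one sample")
    curve = []
    for tau in tolerances:
        hits = sum(1 for d in chamfer if d is not None and d <= tau)
        curve.append(hits / n)
    return curve


def auc_tr(tolerances: Sequence[float], recalls: Sequence[float]) -> float:
    """Trapezoidal area under the curve, normalized to [0, 1] (assumed)."""
    if len(tolerances) != len(recalls) or len(tolerances) < 2:
        raise ValueError("need matching sequences with at least two points")
    area = 0.0
    for i in range(1, len(tolerances)):
        width = tolerances[i] - tolerances[i - 1]
        area += 0.5 * (recalls[i] + recalls[i - 1]) * width
    return area / (tolerances[-1] - tolerances[0])


# Toy data: three executable samples and one failed execution (None).
distances = [0.5, 1.5, None, 3.0]
taus = [0.0, 1.0, 2.0, 3.0, 4.0]
recalls = tolerance_recall_curve(distances, taus)
score = auc_tr(taus, recalls)
```

In this toy example the curve plateaus at 0.75 rather than 1.0 because the failed sample can never be recalled. An average Chamfer distance taken only over executable outputs would ignore that sample entirely.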