🤖 AI Summary
Existing natural language (NL)–to–Verilog synthesis methods are hindered by ambiguous, unstructured NL inputs, which create a large semantic gap and limit generation accuracy. Method: We propose CRUX (Core Refined Understanding eXpression), a structured intermediate representation that explicitly models user intent as a hardware semantic graph, bridging the semantic gap between NL and Verilog. We further design a two-stage joint training framework: Stage I learns the NL→CRUX mapping, while Stage II optimizes CRUX→Verilog generation and supports cross-model transfer. Contribution/Results: Our approach achieves state-of-the-art performance among general-purpose large language models on multiple Verilog synthesis benchmarks, with particularly significant improvements on complex circuit design tasks. Moreover, CRUX serves as a plug-and-play enhancement module for other code generation models, demonstrating strong generalizability and practical utility.
📝 Abstract
Large language models (LLMs) have shown promising capabilities in hardware description language (HDL) generation. However, existing approaches typically rely on free-form natural language descriptions that are often ambiguous, redundant, and unstructured, posing significant challenges for downstream Verilog code generation. We treat hardware code generation as a complex transformation from an open-ended natural language space to a domain-specific, highly constrained target space. To bridge this gap, we introduce Core Refined Understanding eXpression (CRUX), a structured intermediate space that captures the essential semantics of user intent while organizing the expression for precise Verilog code generation. We further design a two-stage training framework, comprising Joint Expression Modeling and Dual-Space Optimization, to enhance the quality of both CRUX and Verilog code. Experiments across multiple Verilog generation benchmarks demonstrate that our model, CRUX-V, achieves state-of-the-art performance among general models, particularly on challenging design tasks. Furthermore, the CRUX space proves transferable and beneficial when used as input prompts for other code models, highlighting its effectiveness in narrowing the gap between free-form natural language descriptions and precise Verilog generation.