QiMeng-CRUX: Narrowing the Gap between Natural Language and Verilog via Core Refined Understanding eXpression

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing natural language (NL)–to–Verilog synthesis methods suffer from ambiguous and unstructured NL inputs, resulting in a large semantic gap and low generation accuracy. Method: We propose CRUX (Core Refined Understanding eXpression), a structured intermediate representation that explicitly models user intent as a hardware semantic graph to bridge the semantic gap between NL and Verilog. We further design a two-stage joint training framework: Stage I learns the NL→CRUX mapping, while Stage II optimizes CRUX→Verilog generation and supports cross-model transfer. Contribution/Results: Our approach achieves state-of-the-art performance among general-purpose large language models on multiple Verilog synthesis benchmarks, with particularly significant improvements on complex circuit design tasks. Moreover, CRUX serves as a plug-and-play enhancement module for other code generation models, demonstrating strong generalizability and practical utility.
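To make the idea of a structured intermediate space concrete, here is a minimal sketch of what a CRUX-like representation might look like: a typed port list plus ordered, unambiguous behavioral clauses, rendered into a disambiguated prompt for a downstream code model. All class and field names here are illustrative assumptions, not the paper's actual CRUX schema.

```python
# Hypothetical CRUX-style structured intermediate representation (illustrative only).
from dataclasses import dataclass

@dataclass
class Port:
    name: str
    direction: str  # "input" or "output"
    width: int = 1

@dataclass
class CruxSpec:
    module: str
    ports: list[Port]
    behavior: list[str]  # ordered, unambiguous behavioral clauses

    def to_prompt(self) -> str:
        """Render the structured spec as a precise prompt for Verilog generation."""
        lines = [f"MODULE: {self.module}", "PORTS:"]
        for p in self.ports:
            lines.append(f"  {p.direction} [{p.width - 1}:0] {p.name}")
        lines.append("BEHAVIOR:")
        lines += [f"  - {clause}" for clause in self.behavior]
        return "\n".join(lines)

# A free-form request like "make an 8-bit counter that resets" becomes:
spec = CruxSpec(
    module="counter",
    ports=[Port("clk", "input"), Port("rst", "input"), Port("q", "output", 8)],
    behavior=["on rising clk: q <= 0 if rst else q + 1"],
)
print(spec.to_prompt())
```

The point of such a representation is that every port width, direction, and behavioral rule is explicit before code generation begins, which is the gap-narrowing role the summary attributes to CRUX.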

📝 Abstract
Large language models (LLMs) have shown promising capabilities in hardware description language (HDL) generation. However, existing approaches often rely on free-form natural language descriptions that are ambiguous, redundant, and unstructured, which poses significant challenges for downstream Verilog code generation. We treat hardware code generation as a complex transformation from an open-ended natural language space to a domain-specific, highly constrained target space. To bridge this gap, we introduce Core Refined Understanding eXpression (CRUX), a structured intermediate space that captures the essential semantics of user intent while organizing the expression for precise Verilog code generation. We further design a two-stage training framework, comprising Joint Expression Modeling and Dual-Space Optimization, to enhance the quality of both CRUX and Verilog code. Experiments across multiple Verilog generation benchmarks demonstrate that our model, CRUX-V, achieves state-of-the-art performance among general models, particularly under challenging design tasks. Furthermore, the CRUX space proves transferable and beneficial when used as input prompts for other code models, highlighting its effectiveness in narrowing the gap between free-form natural language descriptions and precise Verilog generation.
Problem

Research questions and friction points this paper is trying to address.

Bridging ambiguous natural language and precise Verilog generation
Creating structured intermediate representation for hardware semantics
Improving code quality via dual-space optimization framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing structured CRUX intermediate representation space
Implementing two-stage training with joint modeling
Optimizing dual-space for Verilog code quality
Lei Huang
State Key Lab of Processors, Institute of Computing Technology, CAS
Rui Zhang
State Key Lab of Processors, Institute of Computing Technology, CAS
Jiaming Guo
Institute of Computing Technology, Chinese Academy of Sciences
Yang Zhang
State Key Lab of Processors, Institute of Computing Technology, CAS
Di Huang
State Key Lab of Processors, Institute of Computing Technology, CAS
Shuyao Cheng
State Key Lab of Processors, Institute of Computing Technology, CAS
Pengwei Jin
State Key Lab of Processors, Institute of Computing Technology, CAS
Chongxiao Li
ICT, CAS
Zidong Du
State Key Lab of Processors, Institute of Computing Technology, CAS
Xing Hu
State Key Lab of Processors, Institute of Computing Technology, CAS
Qi Guo
State Key Lab of Processors, Institute of Computing Technology, CAS
Yunji Chen
Institute of Computing Technology, Chinese Academy of Sciences