QiMeng-CRUX: Narrowing the Gap between Natural Language and Verilog via Core Refined Understanding eXpression

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing natural language (NL)–to–Verilog synthesis methods suffer from ambiguous and unstructured NL inputs, resulting in a large semantic gap and low generation accuracy. Method: We propose CRUX (Core Refined Understanding eXpression), a structured intermediate representation that explicitly models user intent as a hardware semantic graph to bridge the semantic gap between NL and Verilog. We further design a two-stage joint training framework: Stage I learns the NL→CRUX mapping, while Stage II optimizes CRUX→Verilog generation and supports cross-model transfer. Contribution/Results: Our approach achieves state-of-the-art performance among general-purpose large language models on multiple Verilog synthesis benchmarks, with particularly significant improvements on complex circuit design tasks. Moreover, CRUX serves as a plug-and-play enhancement module for other code generation models, demonstrating strong generalizability and practical utility.
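To make the idea of a structured intermediate space concrete, here is a minimal sketch of what a CRUX-like representation might look like: a typed port list plus ordered, unambiguous behavioral clauses, rendered into a disambiguated prompt for a downstream code model. All class and field names here are illustrative assumptions, not the paper's actual CRUX schema.

```python
# Hypothetical CRUX-style structured intermediate representation (illustrative only).
from dataclasses import dataclass

@dataclass
class Port:
    name: str
    direction: str  # "input" or "output"
    width: int = 1

@dataclass
class CruxSpec:
    module: str
    ports: list[Port]
    behavior: list[str]  # ordered, unambiguous behavioral clauses

    def to_prompt(self) -> str:
        """Render the structured spec as a precise prompt for Verilog generation."""
        lines = [f"MODULE: {self.module}", "PORTS:"]
        for p in self.ports:
            lines.append(f"  {p.direction} [{p.width - 1}:0] {p.name}")
        lines.append("BEHAVIOR:")
        lines += [f"  - {clause}" for clause in self.behavior]
        return "\n".join(lines)

# A free-form request like "make an 8-bit counter that resets" becomes:
spec = CruxSpec(
    module="counter",
    ports=[Port("clk", "input"), Port("rst", "input"), Port("q", "output", 8)],
    behavior=["on rising clk: q <= 0 if rst else q + 1"],
)
print(spec.to_prompt())
```

The point of such a representation is that every port width, direction, and behavioral rule is explicit before code generation begins, which is the gap-narrowing role the summary attributes to CRUX.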

📝 Abstract
Large language models (LLMs) have shown promising capabilities in hardware description language (HDL) generation. However, existing approaches often rely on free-form natural language descriptions that are ambiguous, redundant, and unstructured, which poses significant challenges for downstream Verilog code generation. We treat hardware code generation as a complex transformation from an open-ended natural language space to a domain-specific, highly constrained target space. To bridge this gap, we introduce Core Refined Understanding eXpression (CRUX), a structured intermediate space that captures the essential semantics of user intent while organizing the expression for precise Verilog code generation. We further design a two-stage training framework, comprising Joint Expression Modeling and Dual-Space Optimization, to enhance the quality of both CRUX and Verilog code. Experiments across multiple Verilog generation benchmarks demonstrate that our model, CRUX-V, achieves state-of-the-art performance among general models, particularly under challenging design tasks. Furthermore, the CRUX space proves transferable and beneficial when used as input prompts for other code models, highlighting its effectiveness in narrowing the gap between free-form natural language descriptions and precise Verilog generation.
Problem

Research questions and friction points this paper is trying to address.

Bridging ambiguous natural language and precise Verilog generation
Creating structured intermediate representation for hardware semantics
Improving code quality via dual-space optimization framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing structured CRUX intermediate representation space
Implementing two-stage training with joint modeling
Optimizing dual-space for Verilog code quality
Lei Huang
State Key Lab of Processors, Institute of Computing Technology, CAS
Rui Zhang
State Key Lab of Processors, Institute of Computing Technology, CAS
Jiaming Guo
Institute of Computing Technology, Chinese Academy of Sciences
Yang Zhang
State Key Lab of Processors, Institute of Computing Technology, CAS
Di Huang
State Key Lab of Processors, Institute of Computing Technology, CAS
Shuyao Cheng
State Key Lab of Processors, Institute of Computing Technology, CAS
Pengwei Jin
State Key Lab of Processors, Institute of Computing Technology, CAS
Chongxiao Li
ICT, CAS
Zidong Du
State Key Lab of Processors, Institute of Computing Technology, CAS
Xing Hu
State Key Lab of Processors, Institute of Computing Technology, CAS
Qi Guo
State Key Lab of Processors, Institute of Computing Technology, CAS
Yunji Chen
Institute of Computing Technology, Chinese Academy of Sciences