🤖 AI Summary
Existing RTL code generation methods rely heavily on commercial LLMs (e.g., the GPT series), which raises privacy risks and limits customization, while open-source models tailored to this task perform notably worse. This work addresses these limitations with an open-source RTL generation solution for hardware design automation. It has two key components: (1) a new open-source RTL code dataset; and (2) a domain-specific LLM with a lightweight 7B-parameter architecture, trained with a customized algorithm and quantized to 4 bits, yielding a compact 4 GB model that runs locally on a single laptop with only slight performance degradation. The model outperforms GPT-3.5 on all representative RTL code generation benchmarks and outperforms GPT-4 on the VerilogEval Machine benchmark. By balancing accuracy, privacy preservation, and practical deployability, the framework provides an open foundation for AI-assisted hardware design.
📝 Abstract
The automatic generation of RTL code (e.g., Verilog) from natural language instructions using large language models (LLMs) has recently attracted significant research interest. However, most existing approaches rely heavily on commercial LLMs such as ChatGPT, while open-source LLMs tailored for this specific design generation task exhibit notably inferior performance. The absence of high-quality open-source solutions restricts the flexibility and data privacy of this emerging technique. In this study, we present a new customized LLM solution with a modest parameter count of only 7B, achieving better performance than GPT-3.5 on all representative benchmarks for RTL code generation. In particular, it outperforms GPT-4 on the VerilogEval Machine benchmark. This balance between accuracy and efficiency is made possible by our new RTL code dataset and a customized LLM algorithm, both of which have been made fully open-source. Furthermore, we have quantized our LLM to 4 bits with a total size of 4 GB, enabling it to run on a single laptop with only slight performance degradation. This efficiency allows the RTL generator to serve as a local assistant for engineers, addressing design privacy concerns.
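The reported model size can be sanity-checked with simple arithmetic. The sketch below is not from the paper; it only estimates the raw weight storage for a given parameter count and bit width. The function name `weight_size_gb` and the use of decimal gigabytes are assumptions made for illustration.

```python
def weight_size_gb(num_params: int, bits_per_param: int) -> float:
    """Estimate raw weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Hypothetical helper: it counts weight bits only and ignores any
    extra storage a real quantized checkpoint may carry.
    """
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


PARAMS_7B = 7_000_000_000  # parameter count stated in the abstract

fp16_gb = weight_size_gb(PARAMS_7B, 16)  # common half-precision baseline
int4_gb = weight_size_gb(PARAMS_7B, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights alone come to about 3.5 GB, which is consistent with the roughly 4 GB total stated in the abstract; the remainder would plausibly be quantization metadata or layers kept at higher precision, though the abstract does not say.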