Structured Testbench Generation for LLM-Driven HDL Design and Verification-Oriented Data Curation

📅 2026-06-11
🤖 AI Summary
This work addresses key bottlenecks of automated testbench generation in LLM-driven RTL design flows: stochastic outputs, high token cost, poor reproducibility, and insufficient coverage. The authors propose the Structured Testbench Generation (STG) framework, which exploits the inherent structure of hardware designs to produce deterministic testbenches. Used as a direct verification tool, STG runs 720× faster than an iterative LLM-based testbench generation flow, achieves higher compilation success rates and coverage, and produces fewer false-pass verdicts on incorrect DUTs. Used as a data curation engine, it is 11× faster than LLM-based filtering on a single CPU core and consumes 127× less energy. Used as a test-time scaling oracle, it reduces node count by 14-47%. Models distilled from the curated data attain state-of-the-art performance in the authors' multi-benchmark evaluation.
πŸ“ Abstract
Automated testbench generation has become a critical bottleneck in large language model (LLM)-driven Register Transfer Level (RTL) workflows, where large numbers of candidate designs must be verified rapidly and reliably. Existing prompt-based approaches treat testbench generation as unconstrained code synthesis, yielding stochastic outputs with high token cost, low reproducibility, and insufficient coverage. To address this gap, we present STG, a Structured Testbench Generation framework that exploits the inherent structure of hardware designs to generate deterministic testbenches. As a direct verification tool, STG runs 720x faster than an iterative LLM-based testbench generation flow and higher rate of successful compilation, achieves higher coverage, and reduces false-pass verdicts on incorrect DUTs. STG also helps identify errors in RTL generation benchmarks by exposing faulty benchmark testbenches. As a data curation engine, it is 11x faster than LLM-based filtering on a single CPU core with 127x less energy, and the resulting distilled models provide state-of-the-art performance in our multi-benchmark evaluation. As a test-time scaling oracle, it reduces node count by 14-47\%. Our models are available at https://huggingface.co/collections/AS-SiliconMind/siliconmind-v12.
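The abstract does not describe how STG works internally, so the following is only an illustrative sketch of the general idea of deriving a testbench deterministically from a design's structure rather than from a sampled LLM completion. It is not the authors' implementation. Everything in it is an assumption made for illustration: the names `Port`, `parse_ports`, `corner_vectors`, and `generate_testbench`, the simplified regex (ANSI-style port declarations only), and the choice of corner-case stimulus for a purely combinational DUT with no output checking.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    direction: str  # "input" or "output"
    width: int      # number of bits
    name: str


# Hypothetical, simplified pattern: matches ANSI-style declarations such as
# "input [3:0] a" or "output reg [7:0] q". Real RTL needs a proper parser.
PORT_RE = re.compile(
    r"\b(input|output)\s+(?:wire\s+|reg\s+|logic\s+)?"
    r"(?:\[(\d+)\s*:\s*(\d+)\]\s*)?(\w+)"
)


def parse_ports(rtl: str) -> list[Port]:
    """Extract the port list (the 'structure') from a module header."""
    ports = []
    for direction, msb, lsb, name in PORT_RE.findall(rtl):
        width = abs(int(msb) - int(lsb)) + 1 if msb else 1
        ports.append(Port(direction, width, name))
    return ports


def corner_vectors(inputs: list[Port]) -> list[tuple[int, ...]]:
    """Fixed stimulus: all zeros, all ones, then each input at max alone."""
    full = [(1 << p.width) - 1 for p in inputs]
    vectors = [tuple(0 for _ in inputs), tuple(full)]
    for i in range(len(inputs)):
        vectors.append(tuple(full[j] if j == i else 0 for j in range(len(inputs))))
    # Drop duplicates while preserving order, so output stays deterministic.
    return list(dict.fromkeys(vectors))


def generate_testbench(module: str, ports: list[Port]) -> str:
    """Emit a stimulus-only Verilog testbench; same input gives same text."""
    def rng(p: Port) -> str:
        return f"[{p.width - 1}:0] " if p.width > 1 else ""

    inputs = [p for p in ports if p.direction == "input"]
    outputs = [p for p in ports if p.direction == "output"]
    lines = ["`timescale 1ns/1ps", f"module tb_{module};"]
    lines += [f"  reg {rng(p)}{p.name};" for p in inputs]
    lines += [f"  wire {rng(p)}{p.name};" for p in outputs]
    conns = ", ".join(f".{p.name}({p.name})" for p in ports)
    lines.append(f"  {module} dut ({conns});")
    lines.append("  initial begin")
    for vec in corner_vectors(inputs):
        assigns = " ".join(
            f"{p.name} = {p.width}'d{v};" for p, v in zip(inputs, vec)
        )
        lines.append(f"    {assigns} #10;")
    lines += ["    $finish;", "  end", "endmodule"]
    return "\n".join(lines)


if __name__ == "__main__":
    rtl = ("module adder(input [3:0] a, input [3:0] b, output [4:0] sum); "
           "assign sum = a + b; endmodule")
    print(generate_testbench("adder", parse_ports(rtl)))
```

Because the generator is a pure function of the parsed port list, running it twice on the same RTL yields byte-identical testbenches and requires no model inference, which is the kind of reproducibility and low cost the abstract contrasts with prompt-based generation.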
Problem

Research questions and friction points this paper is trying to address.

testbench generation
LLM-driven HDL design
verification
structured generation
data curation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured Testbench Generation
LLM-driven HDL Verification
Deterministic Testbench Synthesis
Verification-Oriented Data Curation
Test-Time Scaling Oracle
🔎 Similar Papers