CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty large language models have in generating cycle-accurate, concurrency-correct register-transfer level (RTL) code, where minor errors can cause functional failures or security vulnerabilities. The study identifies attention heads whose activations correlate with RTL correctness and uses them to construct a low-dimensional correctness subspace within the model's representation space. Building on this, the authors propose a geometry-aware inference-time intervention framework that requires neither additional supervision nor model retraining and is described as model-agnostic. On VerilogEval, the method is reported to improve pass@1/5/10 accuracy by 10%–20%, and on CVDP by 5%, improving RTL generation reliability while preserving inference efficiency.
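For readers unfamiliar with the pass@1/5/10 figures above, the following is a minimal sketch of how pass@k is commonly computed for code generation benchmarks. It assumes the standard unbiased estimator, 1 - C(n-c, k) / C(n, k), for n samples per problem of which c pass; the summary does not state which estimator the authors used, and the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of generated samples for the problem
    c: number of those samples that pass the functional tests
    k: sample budget being evaluated (for example 1, 5, or 10)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples fail, any draw of k samples contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random draw of k samples contains no passing sample.
    all_fail = comb(n - c, k) / comb(n, k)
    return 1.0 - all_fail


# Hypothetical example: 10 samples for one problem, 3 of which pass.
example_pass_at_1 = pass_at_k(10, 3, 1)
example_pass_at_5 = pass_at_k(10, 3, 5)
```

A benchmark score is then the mean of this per-problem value over all problems.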

📝 Abstract

Recent advances in large language models (LLMs) have enabled the automatic generation of register-transfer level (RTL) code from natural language instructions, offering a promising pathway to accelerate chip design. Unlike typical natural language (and software coding) tasks, LLM-based RTL code generation demands strict cycle accuracy with concurrency, where minor logical errors can render a circuit unusable or insecure. While prior work has explored hallucination mitigation via external verification, self-evaluation prompts, retrieval-augmented prompting, domain-specific fine-tuning, agentic solutions, and reasoning, these approaches largely overlook the attention-oriented internal mechanisms of LLMs that may inherently correlate with RTL correctness. This work proposes CASS-RTL, a first-of-its-kind framework for discovering and leveraging LLMs' correctness-aware components to guide RTL generation toward functionally accurate outputs. We (i) identify attention heads whose activation patterns consistently differentiate correct from incorrect RTL; (ii) construct a low-dimensional subspace capturing correctness-relevant signals; and (iii) design a lightweight, geometry-aware intervention that steers the model at inference time. CASS-RTL is fully model-agnostic, requires no additional supervision or retraining, and readily integrates into existing models. Empirically, we evaluate CASS-RTL on multiple models and observe 10%–20% improvement in pass@1/5/10 accuracy on VerilogEval and 5% improvement on CVDP, demonstrating the effectiveness of our method in enhancing reliability without sacrificing model efficiency or requiring a large labeled dataset for fine-tuning.
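The three steps (i) to (iii) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no scoring rule, subspace construction, or steering formula, so everything below is an assumption chosen for illustration. Heads are scored by the gap between class means relative to pooled spread, the subspace is built with Gram-Schmidt over correct-minus-incorrect difference vectors, and steering moves only the in-subspace component of an activation toward the mean of the correct class. All names (`head_score`, `orthonormal_basis`, `steer`, `head_a`, `head_b`) and the data are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def head_score(correct, incorrect):
    """Step (i), assumed rule: class-mean gap divided by pooled spread."""
    mu_c, mu_i = mean_vector(correct), mean_vector(incorrect)
    gap = math.sqrt(sum((x - y) ** 2 for x, y in zip(mu_c, mu_i)))
    spread = 0.0
    for rows, mu in ((correct, mu_c), (incorrect, mu_i)):
        for row in rows:
            spread += sum((x - m) ** 2 for x, m in zip(row, mu))
    spread = math.sqrt(spread / (len(correct) + len(incorrect)))
    return gap / (spread + 1e-12)


def orthonormal_basis(directions, tol=1e-9):
    """Step (ii), assumed rule: Gram-Schmidt over difference vectors."""
    basis = []
    for d in directions:
        v = list(d)
        for b in basis:
            proj = dot(v, b)
            v = [x - proj * y for x, y in zip(v, b)]
        length = math.sqrt(dot(v, v))
        if length > tol:  # skip directions already spanned by the basis
            basis.append([x / length for x in v])
    return basis


def steer(h, basis, target, alpha=1.0):
    """Step (iii), assumed rule: shift h toward target inside the subspace only."""
    out = list(h)
    delta = [t - x for t, x in zip(target, h)]
    for b in basis:
        coeff = alpha * dot(delta, b)
        out = [o + coeff * y for o, y in zip(out, b)]
    return out


# Toy activations per head: (samples from correct RTL, samples from incorrect RTL).
heads = {
    "head_a": ([[1.0, 0.1, 0.0], [0.9, -0.1, 0.0]],
               [[-1.0, 0.1, 0.0], [-0.9, -0.1, 0.0]]),
    "head_b": ([[0.1, 1.0, 0.0], [-0.1, -1.0, 0.0]],
               [[0.1, -1.0, 0.0], [-0.1, 1.0, 0.0]]),
}

scores = {name: head_score(c, i) for name, (c, i) in heads.items()}
best = max(scores, key=scores.get)

correct, incorrect = heads[best]
directions = [[x - y for x, y in zip(c, i)] for c, i in zip(correct, incorrect)]
basis = orthonormal_basis(directions)
target = mean_vector(correct)

h = [-1.0, 0.5, 2.0]
steered = steer(h, basis, target, alpha=1.0)
```

In this toy setup only the coordinate lying in the subspace changes; components orthogonal to it are left untouched, which is one plausible reading of "geometry-aware". In a real model the activations would come from forward hooks on the selected attention heads, and `alpha` would need tuning.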

Problem

Research questions and friction points this paper is trying to address.

RTL generation

large language models

correctness

cycle accuracy

logical errors

Innovation

Methods, ideas, or system contributions that make the work stand out.

correctness-aware

subspace steering

attention heads