UniRTL: Unifying Code and Graph for Robust RTL Representation Learning

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing RTL representation methods are often confined to a single modality, either code or graph structure, which makes it difficult to fully capture hardware semantics. This work proposes UniRTL, a multimodal pretraining framework that unifies RTL code and control data flow graphs (CDFGs). UniRTL integrates code and graph representations through a graph-aware tokenizer, fine-grained cross-modal masking, and a staged hierarchical alignment strategy. Evaluated on performance prediction and code retrieval tasks, UniRTL consistently outperforms existing approaches, indicating its effectiveness and robustness for hardware design automation.

📝 Abstract

Developing effective representations for register transfer level (RTL) designs is crucial for accelerating the hardware design workflow. Existing approaches, however, typically rely on a single data modality, either the RTL code or its associated graph-based representation, limiting the expressiveness and generalization ability of the learned representations. For RTL, the control data flow graph (CDFG) offers a comprehensive structural representation that preserves complete information, while the code modality explicitly encodes semantic and functional information. We argue that integrating these complementary modalities is essential for a thorough understanding of RTL designs. To this end, we propose UniRTL, a multimodal pretraining framework that learns unified RTL representations by jointly leveraging code and CDFG. UniRTL achieves fine-grained alignment between code and graph through mutual masked modeling and employs a hierarchical training strategy that incorporates a pretrained graph-aware tokenizer and staged alignment of text (i.e., functional summary) and code prior to graph integration. We evaluate UniRTL on two downstream tasks, performance prediction and code retrieval, under multiple settings. Experimental results show that UniRTL consistently outperforms prior methods, establishing it as a more robust and powerful foundation for advancing hardware design automation.
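The mutual masked modeling idea mentioned in the abstract can be illustrated with a toy input-corruption sketch. It only shows how two training views might be built, each pairing one masked modality with the other left intact so the masked side could be reconstructed from its counterpart. The function names, the `[MASK]` placeholder, the 30% ratio, and the flat node list standing in for a CDFG are all assumptions made for illustration. The abstract does not specify UniRTL's actual masking scheme, tokenizer, or model.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_sequence(items, ratio, rng):
    """Replace roughly `ratio` of the items with MASK.

    Returns the corrupted copy and a dict {position: original item}.
    """
    n = len(items)
    k = max(1, round(n * ratio)) if n else 0
    positions = sorted(rng.sample(range(n), k))
    masked = list(items)
    targets = {}
    for p in positions:
        targets[p] = masked[p]
        masked[p] = MASK
    return masked, targets


def mutual_mask(code_tokens, graph_nodes, ratio=0.3, seed=0):
    """Build two views: masked code + full graph, and full code + masked graph."""
    rng = random.Random(seed)
    masked_code, code_targets = mask_sequence(code_tokens, ratio, rng)
    masked_graph, graph_targets = mask_sequence(graph_nodes, ratio, rng)
    return {
        # A model would predict the hidden code tokens using the intact graph.
        "code_from_graph": (masked_code, list(graph_nodes), code_targets),
        # A model would predict the hidden graph nodes using the intact code.
        "graph_from_code": (list(code_tokens), masked_graph, graph_targets),
    }


if __name__ == "__main__":
    code = "assign y = a & b ;".split()
    graph = ["a", "b", "and", "y"]  # toy node labels standing in for a CDFG
    for name, view in mutual_mask(code, graph).items():
        print(name, view)
```

In a real pipeline the graph side would carry edges and node features rather than a flat label list, and the reconstruction targets would feed a loss computed by the pretrained encoder.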

Problem

Research questions and friction points this paper is trying to address.

RTL representation

multimodal learning

code and graph unification

hardware design automation

representation learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal pretraining

RTL representation learning

control data flow graph