MARVEL: Multi-Agent RTL Vulnerability Extraction using Large Language Models

📅 2025-05-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low efficiency and limited accuracy of vulnerability detection in RTL hardware security verification, this paper proposes MARVEL, an end-to-end multi-agent framework built on large language models (LLMs). The framework introduces a unified multi-agent architecture: a Supervisor Agent derives verification strategies from the SoC's security documentation, while coordinated Executor Agents orchestrate heterogeneous tools (formal verification, RTL linting, functional simulation, static analysis, and LLM-based reasoning) to support task decomposition, tool interoperability, and result fusion. By overcoming the cognitive and technical limitations of single-model or single-tool approaches, the framework substantially improves vulnerability identification. Evaluated on an OpenTitan-based SoC from the Hack@DATE competition, MARVEL reports 48 issues, of which 20 are confirmed as genuine security vulnerabilities, a markedly higher detection rate and precision than baseline methods.

📝 Abstract
Hardware security verification is a challenging and time-consuming task. For this purpose, design engineers may utilize tools such as formal verification, linters, and functional simulation tests, coupled with analysis and a deep understanding of the hardware design being inspected. Large Language Models (LLMs) have been used to assist during this task, either directly or in conjunction with existing tools. We improve the state of the art by proposing MARVEL, a multi-agent LLM framework for a unified approach to decision-making, tool use, and reasoning. MARVEL mimics the cognitive process of a designer looking for security vulnerabilities in RTL code. It consists of a supervisor agent that devises the security policy of the system-on-chip (SoC) using its security documentation, then delegates tasks to validate that policy to individual executor agents. Each executor agent carries out its assigned task using a particular strategy, and may use one or more tools to identify potential security bugs in the design, sending the results back to the supervisor agent for further analysis and confirmation. MARVEL includes executor agents that leverage formal tools, linters, simulation tests, LLM-based detection schemes, and static analysis-based checks. We test our approach on a known buggy SoC based on OpenTitan from the Hack@DATE competition. We find that 20 of the 48 issues reported by MARVEL pose security vulnerabilities.
Problem

Research questions and friction points this paper is trying to address.

Automating RTL vulnerability detection using multi-agent LLMs
Unifying decision-making and tool use for hardware security verification
Enhancing accuracy in identifying security bugs in SoC designs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLM framework for RTL vulnerability extraction
Supervisor agent delegates tasks to executor agents
Combines formal tools, linters, and LLM-based detection
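
The supervisor/executor delegation described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the agent classes, task derivation, tool stubs, and the confirmation heuristic are all hypothetical assumptions standing in for the paper's LLM-driven components.

```python
# Hypothetical sketch of a MARVEL-style supervisor/executor loop.
# Real executors would invoke formal tools, linters, or simulations on RTL;
# here each tool is a stub, and "result fusion" is a toy confirmation check.
from dataclasses import dataclass


@dataclass
class Finding:
    tool: str
    description: str
    confirmed: bool = False


class ExecutorAgent:
    """Runs one detection strategy (e.g. linting, formal verification)."""

    def __init__(self, tool: str):
        self.tool = tool

    def run(self, task: str) -> list[Finding]:
        # Stub: a real executor would run its tool and parse the report.
        return [Finding(self.tool, f"{self.tool} flagged: {task}")]


class SupervisorAgent:
    """Derives tasks from the security spec and fuses executor results."""

    def __init__(self, executors: list[ExecutorAgent]):
        self.executors = executors

    def derive_tasks(self, security_spec: list[str]) -> list[str]:
        # Stub: the paper uses an LLM on the SoC's security documentation.
        return [f"verify: {req}" for req in security_spec]

    def verify(self, security_spec: list[str]) -> list[Finding]:
        findings: list[Finding] = []
        for task in self.derive_tasks(security_spec):
            for agent in self.executors:
                findings.extend(agent.run(task))
        # Fusion step: the supervisor re-analyses raw reports and confirms
        # only a subset as genuine vulnerabilities (toy keyword heuristic).
        for f in findings:
            f.confirmed = "access control" in f.description
        return findings


supervisor = SupervisorAgent([ExecutorAgent("linter"), ExecutorAgent("formal")])
results = supervisor.verify(["access control on debug port", "reset integrity"])
print(len(results), sum(f.confirmed for f in results))  # → 4 2
```

Two tasks fanned out to two executors yield four raw findings, of which the fusion step confirms two, mirroring the paper's pattern of many reported issues narrowed down to confirmed vulnerabilities.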