LLMs as Firmware Experts: A Runtime-Grown Tree-of-Agents Framework

📅 2025-11-23

🤖 AI Summary

Large language models (LLMs) lose effectiveness on large-scale firmware vulnerability detection because firmware ships as binaries, its components are heterogeneous, and its dependencies span many files. This paper proposes FIRMHIVE, a recursive hive of autonomous agents for firmware security analysis. Its core contributions are: (1) making delegation an executable, per-agent primitive; (2) constructing a runtime Tree of Agents (ToA) for decentralized, dynamically scalable multi-agent coordination; and (3) combining recursive task decomposition with firmware-level cross-file dependency reasoning. Evaluated on real-world firmware images, FIRMHIVE produces about 5.6× more alerts per firmware than existing LLM-agent baselines, identifies about 1.5× more vulnerabilities than state-of-the-art security tools (1,802 total), and achieves 71% precision.

📝 Abstract

Large Language Models (LLMs) and their agent systems have recently demonstrated strong potential in automating code reasoning and vulnerability detection. However, when applied to large-scale firmware, their performance degrades due to the binary nature of firmware, complex dependency structures, and heterogeneous components. To address this challenge, this paper presents FIRMHIVE, a recursive agent hive that enables LLMs to act as autonomous firmware security analysts. FIRMHIVE introduces two key mechanisms: (1) transforming delegation into a per-agent, executable primitive and (2) constructing a runtime Tree of Agents (ToA) for decentralized coordination. We evaluate FIRMHIVE using real-world firmware images obtained from publicly available datasets, covering five representative security analysis tasks. Compared with existing LLM-agent baselines, FIRMHIVE performs deeper (about 16x more reasoning steps) and broader (about 2.3x more files inspected) cross-file exploration, resulting in about 5.6x more alerts per firmware. Compared to state-of-the-art (SOTA) security tools, FIRMHIVE identifies about 1.5x more vulnerabilities (1,802 total) and achieves 71% precision, representing significant improvements in both yield and fidelity.

Problem

Research questions and friction points this paper is trying to address.

Addresses performance degradation of LLMs in large-scale firmware analysis

Tackles challenges arising from firmware's binary nature, complex dependency structures, and heterogeneous components

Improves the depth and breadth of cross-file exploration during vulnerability detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Recursive agent hive for autonomous firmware analysis

Transforms delegation into a per-agent, executable primitive

Constructs a runtime Tree of Agents (ToA) for decentralized coordination
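
The two mechanisms listed above can be illustrated with a small sketch. This is not FIRMHIVE's implementation. It only shows, under stated assumptions, how a tree of agents might grow at runtime when each agent can call a delegation primitive itself. The names Agent, delegate, Planner, LAYOUT, and toy_planner are hypothetical, and a fixed lookup table stands in for the LLM that would decide when to delegate.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical planner type: given a task, return the sub-tasks to delegate.
# An empty list means "handle this task locally". A real system would
# query an LLM here; this sketch uses a plain function instead.
Planner = Callable[[str], List[str]]


@dataclass
class Agent:
    task: str
    planner: Planner
    depth: int = 0
    max_depth: int = 3  # assumed safety cap on recursion depth
    children: List["Agent"] = field(default_factory=list)

    def delegate(self, subtask: str) -> str:
        """Delegation as a per-agent primitive: spawn a child and run it."""
        child = Agent(subtask, self.planner, self.depth + 1, self.max_depth)
        self.children.append(child)  # the tree grows only when this is called
        return child.run()

    def run(self) -> str:
        # Agents at the depth cap never delegate further.
        subtasks = self.planner(self.task) if self.depth < self.max_depth else []
        if not subtasks:
            return f"analyzed({self.task})"
        findings = [self.delegate(s) for s in subtasks]
        return f"{self.task}[" + ", ".join(findings) + "]"


def tree_size(agent: Agent) -> int:
    """Count the agents in the tree rooted at `agent`."""
    return 1 + sum(tree_size(child) for child in agent.children)


# Hypothetical firmware layout: which files each item leads the analysis to.
LAYOUT = {
    "firmware.bin": ["bin/httpd", "etc/init.d"],
    "bin/httpd": ["lib/libcgi.so"],
}


def toy_planner(task: str) -> List[str]:
    return LAYOUT.get(task, [])


root = Agent("firmware.bin", toy_planner)
report = root.run()
print(report)
print("agents spawned:", tree_size(root))
```

In this sketch no agent graph is declared up front. The tree's shape is determined entirely by the delegation decisions made during the run, and each parent only aggregates the findings of the children it spawned.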
