Discovering 100+ Compiler Defects in 72 Hours via LLM-Driven Semantic Logic Recomposition

📅 2026-01-18
🤖 AI Summary
Existing compiler fuzzing approaches struggle to preserve the critical semantic logic necessary to trigger deep-seated bugs, resulting in insufficient program diversity and limited defect discovery. To address this, this work proposes FeatureFuzz, the first semantics-driven compiler fuzzer that decouples semantic patterns from historical bug reports into reusable feature units—each comprising natural language descriptions and code examples—and leverages large language models to synthesize and instantiate programs guided by these semantics. Evaluated on GCC and LLVM, FeatureFuzz uncovered 106 real-world bugs within 72 hours (76 confirmed), achieving 2.78× more crashes than the best baseline tool within 24 hours, thereby substantially enhancing both semantic diversity and bug-triggering capability.

📝 Abstract
Compilers constitute the foundational root-of-trust in software supply chains; however, their immense complexity inevitably conceals critical defects. Recent research has attempted to leverage historical bugs to design new mutation operators or fine-tune models to increase program diversity for compiler fuzzing. We observe, however, that bugs manifest primarily based on the semantics of input programs rather than their syntax. Unfortunately, current approaches, whether relying on syntactic mutation or general Large Language Model (LLM) fine-tuning, struggle to preserve the specific semantics found in the logic of bug-triggering programs. Consequently, these critical semantic triggers are often lost, which limits the diversity of generated programs. To explicitly reuse such semantics, we propose FeatureFuzz, a compiler fuzzer that combines features to generate programs. We define a feature as a decoupled primitive that encapsulates a natural language description of a bug-prone invariant, such as an out-of-bounds array access, alongside a concrete code witness of its realization. FeatureFuzz operates via a three-stage workflow: it first extracts features from historical bug reports, then synthesizes coherent groups of features, and finally instantiates these groups into valid programs for compiler fuzzing. We evaluated FeatureFuzz on GCC and LLVM. Over 24-hour campaigns, FeatureFuzz uncovered 167 unique crashes, which is 2.78× more than the second-best fuzzer. Furthermore, through a 72-hour fuzzing campaign, FeatureFuzz identified 106 bugs in GCC and LLVM, 76 of which have already been confirmed by compiler developers, validating the approach's ability to stress-test modern compilers effectively.
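The feature definition and three-stage workflow described in the abstract can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration, not FeatureFuzz's actual implementation: the `Feature` class, the function names, the report fields `summary` and `reproducer`, the exhaustive pairing used for grouping, and the `echo_llm` placeholder that stands in for a real model call. Apart from the out-of-bounds access mentioned in the abstract, the sample reports are invented.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Feature:
    """A decoupled primitive: a description plus a concrete code witness."""
    description: str  # natural language description of the bug-prone behavior
    witness: str      # code snippet that realizes it


def extract_features(reports: Iterable[Dict[str, str]]) -> List[Feature]:
    """Stage 1 (stand-in): assumes each report already has a summary and reproducer."""
    return [Feature(r["summary"], r["reproducer"]) for r in reports]


def synthesize_groups(features: List[Feature], size: int = 2) -> List[Tuple[Feature, ...]]:
    """Stage 2 (stand-in): enumerates every combination instead of judging coherence."""
    return list(combinations(features, size))


def build_prompt(group: Tuple[Feature, ...]) -> str:
    """Render a feature group as a generation prompt (hypothetical wording)."""
    parts = ["Write one valid C program that combines these features:"]
    for i, feature in enumerate(group, start=1):
        parts.append(f"Feature {i}: {feature.description}\nWitness:\n{feature.witness}")
    return "\n\n".join(parts)


def instantiate(group: Tuple[Feature, ...], llm: Callable[[str], str]) -> str:
    """Stage 3: ask a model (any callable from prompt to text) for a program."""
    return llm(build_prompt(group))


def echo_llm(prompt: str) -> str:
    """Placeholder model: wraps the prompt in a C comment, not a real program."""
    return "/* " + prompt.replace("*/", "* /") + " */"


# Illustrative reports; only the first description comes from the abstract.
REPORTS = [
    {"summary": "out-of-bounds array access",
     "reproducer": "int a[2]; a[2] = 0;"},
    {"summary": "signed overflow in a loop counter",
     "reproducer": "for (int i = 1; i > 0; i *= 2) {}"},
    {"summary": "recursive inline function",
     "reproducer": "static inline int f(int n) { return n ? f(n - 1) : 0; }"},
]

features = extract_features(REPORTS)
groups = synthesize_groups(features)
programs = [instantiate(g, echo_llm) for g in groups]
print(len(features), len(groups), len(programs))
```

In a real campaign the `llm` argument would be a call to an actual model, and each returned program would be compiled with GCC or LLVM to look for crashes. The sketch only shows how descriptions and witnesses travel together from extraction to instantiation.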
Problem

Research questions and friction points this paper is trying to address.

compiler defects
semantic logic
fuzzing
program diversity
bug-triggering semantics
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic logic recomposition
feature-based fuzzing
compiler testing
LLM-driven program generation
bug-triggering semantics
Xinabang He
State Key Laboratory for Novel Software Technology, Nanjing University
Yuanwei Chen
State Key Laboratory for Novel Software Technology, Nanjing University
Hao Wu
State Key Laboratory for Novel Software Technology, Nanjing University
Jikang Zhang
Institute of Dataspace, Hefei Comprehensive National Science Center, China
Zicheng Wang
University of Sydney
computer vision, AI4Science
Ligeng Chen
Honor Device Co., Ltd
Junjie Peng
Shanghai University
Haiyang Wei
State Key Laboratory for Novel Software Technology, Nanjing University
Yi Qian
University of Tulsa
cyber security and privacy, computer networks and wireless communication networks
Tiantai Zhang
State Key Laboratory for Novel Software Technology, Nanjing University
Linzhang Wang
Professor of Computer Science, Nanjing University
software testing, analysis, verification, modeling
Bing Mao
Computer Science, Nanjing University
software security, operating system, distributed system