Synergistic Directed Execution and LLM-Driven Analysis for Zero-Day AI-Generated Malware Detection

📅 2026-03-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the challenge of detecting polymorphic, metamorphic, and context-aware zero-day malware generated by large language models (LLMs), which evades traditional signature-based or shallow heuristic detection methods. The authors propose a hybrid analysis framework that integrates concolic execution, LLM-guided path prioritization, and deep learning–based vulnerability classification. Notably, this is the first approach to synergistically combine LLMs with symbolic execution for malware detection, featuring a provably correct detection algorithm and a reinforcement learning–based feedback mechanism to optimize path exploration. Evaluated on a new benchmark comprising 2,500 LLM-generated malware samples, the method achieves a detection accuracy of 97.5%, outperforming baseline tools such as ClamAV and YARA by 8.4 to 52.2 percentage points, while improving path exploration efficiency by 73.2%.
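The core idea of the LLM-guided exploration strategy can be sketched as a best-first search in which a language model scores candidate paths so suspicious ones are expanded before benign ones. The sketch below is illustrative only: the toy state graph, node names, and hard-coded scores stand in for the paper's real LLM queries over path constraints and disassembly.

```python
import heapq

# Toy program-state graph: each node is a program point, edges are branch
# targets. One path ("entry" -> "decode" -> "payload") represents the
# malicious behavior we want to reach with as few expansions as possible.
GRAPH = {
    "entry":     ["check_env", "decode"],
    "check_env": ["exit"],
    "decode":    ["payload", "exit"],
    "payload":   [],
    "exit":      [],
}

def llm_score(path):
    """Stand-in for the LLM prioritizer: higher scores for paths that pass
    through suspicious program points. A real system would query a language
    model over the path's constraints or code context."""
    suspicious = {"decode": 0.6, "payload": 0.9}
    return sum(suspicious.get(node, 0.0) for node in path)

def guided_explore(start, target, budget=10):
    """Best-first exploration: always pop the highest-scoring path, so the
    malicious path is found with fewer expansions than blind DFS."""
    frontier = [(-llm_score([start]), [start])]
    expanded = 0
    while frontier and expanded < budget:
        _, path = heapq.heappop(frontier)
        expanded += 1
        if path[-1] == target:
            return path, expanded
        for succ in GRAPH[path[-1]]:
            new_path = path + [succ]
            heapq.heappush(frontier, (-llm_score(new_path), new_path))
    return None, expanded

path, cost = guided_explore("entry", "payload")
```

Here the guided search reaches the payload in three expansions, skipping the benign `check_env` branch entirely; this is the mechanism behind the claimed reduction in explored paths relative to depth-first search.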

📝 Abstract
The weaponization of LLMs for automated malware generation poses an existential threat to conventional detection paradigms. AI-generated malware exhibits polymorphic, metamorphic, and context-aware evasion capabilities that render signature-based and shallow heuristic defenses obsolete. This paper introduces a novel hybrid analysis framework that synergistically combines concolic execution with LLM-augmented path prioritization and deep-learning-based vulnerability classification to detect zero-day AI-generated malware with provable guarantees. We formalize the detection problem within a first-order temporal logic over program execution traces, define a lattice-theoretic abstraction for path constraint spaces, and prove both the soundness and relative completeness of our detection algorithm, assuming classifier correctness. The framework introduces three novel algorithms: (i) an LLM-guided concolic exploration strategy that reduces the average number of explored paths by 73.2% compared to depth-first search while maintaining equivalent malicious-path coverage; (ii) a transformer-based path-constraint classifier trained on symbolic execution traces; and (iii) a feedback loop that iteratively refines the LLM's prioritization policy using reinforcement learning from detection outcomes. We provide a comprehensive implementation built upon angr 9.2, Z3 4.12, Hugging Face Transformers 4.38, and PyTorch 2.2, with configuration details enabling reproducibility. Experimental evaluation on the EMBER, Malimg, SOREL-20M, and a novel AI-Gen-Malware benchmark comprising 2,500 LLM-synthesized samples demonstrates that the framework achieves 98.7% accuracy on conventional malware and 97.5% accuracy on AI-generated threats, outperforming ClamAV, YARA, MalConv, and EMBER-GBDT baselines by margins of 8.4–52.2 percentage points on AI-generated samples.
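The second algorithm, the path-constraint classifier, maps symbolic execution constraints to a maliciousness probability. The sketch below replaces the paper's fine-tuned transformer with a tiny hand-set linear model over constraint tokens, purely to show the input/output shape; the token weights and bias are invented for this illustration.

```python
import math
import re

# Invented weights for tokens that tend to appear in decoder/unpacker
# constraints; a real classifier would learn these from labeled traces.
SUSPICIOUS_WEIGHTS = {
    "bvxor": 1.5,   # XOR chains are common in payload-decoding loops
    "bvshl": 0.8,   # shift-based obfuscation
    "select": 0.4,  # memory reads indexed by symbolic data
}

def tokenize(constraint):
    """Split an SMT-LIB-style constraint into lowercase word tokens."""
    return re.findall(r"[a-z_]+", constraint.lower())

def malicious_probability(constraints):
    """Sum token weights across all path constraints, then squash with a
    sigmoid (the -2.0 bias is chosen arbitrarily for the demo)."""
    score = sum(SUSPICIOUS_WEIGHTS.get(tok, 0.0)
                for c in constraints for tok in tokenize(c))
    return 1.0 / (1.0 + math.exp(-(score - 2.0)))

benign = ["(= x 4)", "(bvult y 10)"]
decoder = ["(= (bvxor b k) 0x90)", "(bvshl acc 4)", "(select mem idx)"]
```

Feeding the decoder-style constraints through the scorer yields a probability above 0.5, while the benign arithmetic constraints fall below it, mirroring how the real classifier would flag a path as worth full analysis.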
Problem

Research questions and friction points this paper is trying to address.

AI-generated malware
zero-day detection
LLM weaponization
evasion capabilities
malware detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

concolic execution
LLM-guided path prioritization
zero-day AI-generated malware
transformer-based classifier
reinforcement learning feedback loop
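The reinforcement learning feedback loop listed above can be sketched, under heavy simplification, as a bandit-style weight update: features of the prioritization policy that led to confirmed detections are reinforced, and features active in unproductive episodes are decayed. Feature names and the learning rate here are invented for illustration and are not from the paper.

```python
def update_policy(weights, episode_features, detected, lr=0.1):
    """Nudge each feature weight toward episodes that ended in a confirmed
    detection (+reward) and away from episodes that did not (-reward)."""
    reward = 1.0 if detected else -1.0
    return {f: w + (lr * reward if f in episode_features else 0.0)
            for f, w in weights.items()}

# Start with a neutral policy over three hypothetical prioritization features.
weights = {"xor_loop": 0.5, "anti_debug": 0.5, "string_build": 0.5}

# Episode 1: exploring xor-loop / anti-debug paths yielded a detection.
weights = update_policy(weights, {"xor_loop", "anti_debug"}, detected=True)
# Episode 2: prioritizing string-building paths found nothing malicious.
weights = update_policy(weights, {"string_build"}, detected=False)
```

After the two episodes the policy prefers the productive features (weights 0.6) over the unproductive one (0.4), which is the qualitative behavior the paper's RL refinement aims for at much larger scale.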