🤖 AI Summary
This paper systematically evaluates the capabilities and limitations of large language models (LLMs) in automated exploit generation (AEG). Addressing critical issues in existing benchmarks—such as dataset bias and narrow evaluation dimensions—the authors propose the first dual-dimensional assessment framework jointly measuring *cooperativeness* (e.g., prompt responsiveness, debugging assistance) and *technical capability* (e.g., exploit correctness, reliability). They introduce a de-biased benchmark grounded in five refactored software security labs and design a reproducible, multi-turn LLM-driven attacker prompting paradigm. Experiments span leading closed- and open-weight models—including GPT-4, GPT-4o, and Llama3—employing structured prompting, code semantic reconstruction, and rigorous exploit validation. Results show that while GPT-4 and GPT-4o exhibit high cooperativeness, they fail to produce functionally valid exploits; Llama3 is the most resistant to exploit-generation prompts. Crucially, no model passes strict exploit validity verification, though GPT-4o's low error rate suggests progress; LLM-driven AEG remains at a nascent, foundational stage.
📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in code-related tasks, raising concerns about their potential for automated exploit generation (AEG). This paper presents the first systematic study on LLMs' effectiveness in AEG, evaluating both their cooperativeness and technical proficiency. To mitigate dataset bias, we introduce a benchmark with refactored versions of five software security labs. Additionally, we design an LLM-based attacker to systematically prompt LLMs for exploit generation. Our experiments reveal that GPT-4 and GPT-4o exhibit high cooperativeness, comparable to uncensored models, while Llama3 is the most resistant. However, no model successfully generates exploits for refactored labs, though GPT-4o's minimal errors highlight the potential for LLM-driven AEG advancements.
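The LLM-based attacker described in the abstract works by repeatedly prompting a target model and feeding validation results back into the conversation. A minimal, self-contained sketch of such a multi-turn prompt–validate–feedback loop (illustrative only, not the paper's implementation; `query_model` and `validate_exploit` are hypothetical stubs standing in for a real chat-completion API and the labs' validation harness):

```python
def query_model(history):
    """Stand-in for a call to the target LLM.

    A real attacker would send `history` to a chat-completion API;
    here we return a canned candidate that changes with each turn.
    """
    return "payload_v%d" % len(history)

def validate_exploit(candidate):
    """Stand-in for running the candidate against a refactored lab.

    Real validation would execute the exploit against the vulnerable
    target; here we pretend the third attempt succeeds.
    """
    return candidate.endswith("v3")

def attack_loop(task, max_turns=5):
    """Multi-turn attacker: prompt, validate, append feedback, repeat."""
    history = [task]
    for _ in range(max_turns):
        candidate = query_model(history)
        if validate_exploit(candidate):
            return candidate  # a candidate passed validation
        # Feed the failure back so the next prompt includes it
        history.append("Attempt %s failed validation; revise." % candidate)
    return None  # no valid exploit within the turn budget

result = attack_loop("Generate an exploit for the buffer-overflow lab")
```

Under this design, the model's cooperativeness can be read off from whether it returns candidates at all, while technical capability is measured by whether any candidate ever passes `validate_exploit`, matching the paper's two evaluation axes.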