APPATCH: Automated Adaptive Prompting Large Language Models for Real-World Software Vulnerability Patching

📅 2024-08-24
📈 Citations: 2
Influential: 0
🤖 AI Summary
Automated repair of real-world software vulnerabilities remains challenging due to reliance on test suites, exploit samples, or supervised model fine-tuning. Method: This paper proposes a zero-shot, zero-training, LLM-driven patch generation approach that requires neither test cases nor vulnerability exploits and performs no model adaptation. It introduces a vulnerability-semantics-driven adaptive prompting framework that integrates static semantic analysis with context-aware dynamic prompt generation and rewriting, guiding pre-trained LLMs to precisely understand defect semantics and generate correct patches. Contribution/Results: The method breaks from conventional test-feedback- or fine-tuning-dependent paradigms, achieving high-accuracy vulnerability repair via prompt engineering alone. Evaluated on 97 zero-day and 20 known vulnerabilities, it improves F1-score by up to 28.33% and recall by up to 182.26% over the best baseline, significantly outperforming existing LLM-based prompting methods and non-LLM repair tools.
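The adaptive prompting loop described above can be sketched in miniature: extract vulnerability semantics, build a semantics-aware prompt, and rewrite the prompt with feedback when the generated patch fails a lightweight plausibility check. All function names, the stubbed LLM, and the validation heuristic below are illustrative assumptions, not the paper's actual analyses or prompts.

```python
from typing import Optional

def extract_semantics(vuln_code: str) -> dict:
    # Stand-in for static semantic analysis: identify the root cause
    # and the vulnerable statement (the "sink").
    return {"root_cause": "unchecked buffer index", "sink": vuln_code.strip()}

def build_prompt(semantics: dict, feedback: str = "") -> str:
    # Context-aware prompt construction; feedback drives prompt rewriting.
    prompt = (
        f"Root cause: {semantics['root_cause']}\n"
        f"Vulnerable statement: {semantics['sink']}\n"
        "Generate a minimal patch that removes the vulnerability."
    )
    if feedback:
        prompt += f"\nPrevious attempt was rejected because: {feedback}"
    return prompt

def query_llm(prompt: str) -> str:
    # Stub for a pre-trained LLM: returns a canned patch, improving
    # once the prompt carries rejection feedback.
    if "rejected" in prompt:
        return "if (i < len) buf[i] = c;"
    return "buf[i] = c;"

def patch_is_plausible(patch: str) -> bool:
    # Stand-in for semantic validation: require a bounds check.
    return "<" in patch

def adaptive_patch(vuln_code: str, max_rounds: int = 3) -> Optional[str]:
    semantics = extract_semantics(vuln_code)
    feedback = ""
    for _ in range(max_rounds):
        patch = query_llm(build_prompt(semantics, feedback))
        if patch_is_plausible(patch):
            return patch
        feedback = "patch lacks a bounds check"  # rewrite prompt next round
    return None

print(adaptive_patch("buf[i] = c;"))  # the stubbed loop converges on round 2
```

The key design point the sketch illustrates is that the prompt, not the model, is adapted across rounds: no test execution, exploit input, or fine-tuning is involved, matching the paper's zero-training setting.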

📝 Abstract
Timely and effective vulnerability patching is essential for cybersecurity defense, for which various approaches have been proposed yet still struggle to generate valid and correct patches for real-world vulnerabilities. In this paper, we leverage the power and merits of pre-trained large language models (LLMs) to enable automated vulnerability patching using no test input/exploit evidence and without model training/fine-tuning. To elicit LLMs to effectively reason about vulnerable code behaviors, which is essential for quality patch generation, we introduce vulnerability semantics reasoning and adaptive prompting on LLMs and instantiate the methodology as APPATCH, an automated LLM-based patching system. Our evaluation of APPATCH on 97 zero-day vulnerabilities and 20 existing vulnerabilities demonstrates its superior performance to both existing prompting methods and state-of-the-art non-LLM-based techniques (by up to 28.33% in F1 and 182.26% in recall over the best baseline). Through APPATCH, we demonstrate what helps LLM-based patching and how, and discuss what is still lacking and why.
Problem

Research questions and friction points this paper is trying to address.

Automate vulnerability patching using LLMs without training
Improve patch quality via vulnerability semantics reasoning
Enhance performance over existing methods and non-LLM techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages pre-trained LLMs without training
Introduces vulnerability semantics reasoning
Uses adaptive prompting for patch generation