🤖 AI Summary
This study addresses the challenge of automatically and precisely mapping National Vulnerability Database (NVD) entries to their corresponding vulnerability-fixing commits (VFCs). Conventional approaches suffer from cross-modal matching bias because commit messages are semantically sparse. To mitigate this, we propose PatchSeeker, a semantic enhancement framework based on large language models (LLMs). The method uses LLMs to generate vulnerability-enriched commit summaries that serve as a semantic bridge between natural-language NVD descriptions and code changes, and it uses text embeddings for fine-grained semantic alignment. On the benchmark dataset, the approach achieves 59.3% higher Mean Reciprocal Rank (MRR) and 27.9% higher Recall@10 than the best-performing baseline, Prospector. It also remains effective on recent CVE data. The core contribution is the first systematic integration of LLM-driven semantic summarization into the VFC–NVD alignment task, improving accuracy in cross-modal vulnerability localization.
📝 Abstract
Software vulnerabilities pose serious risks to modern software ecosystems. While the National Vulnerability Database (NVD) is the authoritative source for cataloging these vulnerabilities, it often lacks explicit links to the corresponding Vulnerability-Fixing Commits (VFCs). VFCs encode precise code changes, enabling vulnerability localization, patch analysis, and dataset construction. Automatically mapping NVD records to their true VFCs is therefore critical. Existing approaches rely on sparse, often noisy commit messages and fail to capture the deep semantics of vulnerability descriptions. To address this gap, we introduce PatchSeeker, a novel method that leverages large language models to create rich semantic links between vulnerability descriptions and their VFCs. PatchSeeker generates embeddings from NVD descriptions and enhances commit messages by synthesizing detailed summaries for those that are short or uninformative. These generated messages act as a semantic bridge, narrowing the information gap between natural-language reports and low-level code changes. PatchSeeker achieves 59.3% higher MRR and 27.9% higher Recall@10 than the best-performing baseline, Prospector, on the benchmark dataset. An extended evaluation on recent CVEs further confirms PatchSeeker's effectiveness. An ablation study shows that both the commit message generation method and the choice of backbone LLM contribute positively to PatchSeeker's performance. We also discuss limitations and open challenges to guide future work.
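For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MRR and Recall@k are commonly computed for a ranking task of this kind. It is not taken from the paper: the function names, the toy commit identifiers, and the exact Recall@k definition (the fraction of a CVE's true VFCs that appear in the top k, averaged over CVEs) are assumptions made for illustration, and the paper's evaluation protocol may differ.

```python
from typing import List, Sequence, Set, Tuple

# One query = (commits ranked best-first for a CVE, set of true VFCs for that CVE).
# This data layout is a hypothetical choice for the sketch.
Query = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first true VFC in the ranking, or 0.0 if none appears."""
    for position, commit in enumerate(ranked, start=1):
        if commit in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries: List[Query]) -> float:
    """Average the reciprocal rank over all queries (CVEs)."""
    if not queries:
        raise ValueError("at least one query is required")
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def recall_at_k(queries: List[Query], k: int) -> float:
    """Average, over queries, the fraction of true VFCs found in the top k.

    Assumed definition; with exactly one true VFC per CVE this equals the
    share of CVEs whose VFC is ranked within the top k.
    """
    if not queries:
        raise ValueError("at least one query is required")
    total = 0.0
    for ranked, relevant in queries:
        if not relevant:
            continue  # a query without ground truth contributes 0
        hits = len(set(ranked[:k]) & relevant)
        total += hits / len(relevant)
    return total / len(queries)


# Toy data with made-up commit identifiers, one true VFC per CVE.
toy_queries: List[Query] = [
    (["c3", "c1", "c7"], {"c1"}),  # true VFC at rank 2 -> reciprocal rank 0.5
    (["c4", "c9", "c2"], {"c4"}),  # true VFC at rank 1 -> reciprocal rank 1.0
    (["c5", "c6", "c8"], {"c0"}),  # true VFC not retrieved -> 0.0
]

mrr = mean_reciprocal_rank(toy_queries)   # (0.5 + 1.0 + 0.0) / 3 = 0.5
r_at_1 = recall_at_k(toy_queries, k=1)    # 1 of 3 CVEs hit at rank 1
r_at_2 = recall_at_k(toy_queries, k=2)    # 2 of 3 CVEs hit within rank 2
```

Under these definitions, a reported figure such as "59.3% higher MRR" would be read as a relative improvement over the baseline's MRR on the same set of CVEs, not an absolute difference in percentage points.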