Structured Extraction of Vulnerabilities in OpenVAS and Tenable WAS Reports Using LLMs

📅 2025-11-18

🤖 AI Summary

Problem: OpenVAS and Tenable WAS vulnerability scan reports are unstructured and heterogeneous, which hinders unified analysis and automation in vulnerability management. Method: We propose a large language model (LLM)-based framework that works across both tools, using GPT-4.1 and DeepSeek with domain-adapted prompt engineering and rule-guided post-processing to extract key fields (including vulnerability description, CVSS score, and affected components) and output standardized JSON. Contribution/Results: Evaluated on a report containing 34 vulnerabilities, GPT-4.1 and DeepSeek achieve the highest similarity to the baseline, with ROUGE-L scores above 0.7. The results indicate that complex scanner reports can be turned into usable datasets, which supports risk prioritization now and anonymization of sensitive information as future work.

📝 Abstract

This paper proposes an automated LLM-based method to extract and structure vulnerabilities from OpenVAS and Tenable WAS scanner reports, converting unstructured data into a standardized format for risk management. In an evaluation using a report with 34 vulnerabilities, GPT-4.1 and DeepSeek achieved the highest similarity to the baseline (ROUGE-L greater than 0.7). The method demonstrates feasibility in transforming complex reports into usable datasets, enabling effective prioritization and future anonymization of sensitive data.
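The ROUGE-L figure quoted above measures how much of a reference text is recovered, in order, by the extracted text, using the longest common subsequence (LCS) of tokens. The following is a minimal sketch of that metric, assuming whitespace tokenization, lowercasing, and the balanced F-measure (beta = 1); the paper's exact tokenization and ROUGE configuration are not stated here, and the function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score between a candidate string and a reference string.

    Assumption: tokens are lowercased whitespace-separated words.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


# Hypothetical example: compare an extracted description with a reference one.
score = rouge_l(
    "SQL injection in the login form parameter",
    "SQL injection vulnerability in the login form",
)
```

A score above 0.7 under this definition means the extracted field shares most of its word sequence with the reference, though it says nothing about whether numeric fields such as the CVSS score are exactly right.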

Problem

Research questions and friction points this paper is trying to address.

Extracting structured vulnerability data from security scanner reports

Converting unstructured OpenVAS and Tenable WAS data into standardized format

Enabling effective risk prioritization through automated vulnerability extraction
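To make the "standardized format" and "risk prioritization" goals concrete, here is a minimal sketch of a post-processing step that validates one LLM response and ranks the resulting records. The field names (`description`, `cvss_score`, `affected_components`), the helper names, and the output schema are assumptions for illustration, not the paper's actual schema; the severity bands follow the CVSS v3.x qualitative rating scale.

```python
import json
import re

# Assumed field names; the paper's real schema may differ.
REQUIRED_FIELDS = ("description", "cvss_score", "affected_components")


def severity_from_cvss(score):
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def normalize_record(raw_text, scanner):
    """Parse one model response into a validated, uniform record."""
    # Models often wrap JSON in prose or code fences; keep only the object.
    match = re.search(r"\{.*\}", raw_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))

    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    score = float(data["cvss_score"])
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")

    components = data["affected_components"]
    if isinstance(components, str):
        components = [components]

    return {
        "scanner": scanner,
        # Collapse runs of whitespace left over from report layout.
        "description": " ".join(str(data["description"]).split()),
        "cvss_score": score,
        "severity": severity_from_cvss(score),
        "affected_components": sorted({str(c) for c in components}),
    }


def prioritize(records):
    """Order records from highest to lowest CVSS score."""
    return sorted(records, key=lambda r: r["cvss_score"], reverse=True)
```

Rejecting malformed responses with an exception, rather than silently filling defaults, keeps bad extractions out of the dataset so they can be retried or reviewed by hand.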

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs extract vulnerabilities from scanner reports

Converts unstructured data into standardized format

GPT-4.1 and DeepSeek achieve the highest similarity scores (ROUGE-L above 0.7)
