Structured Extraction of Vulnerabilities in OpenVAS and Tenable WAS Reports Using LLMs

📅 2025-11-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
OpenVAS and Tenable WAS vulnerability scan reports are unstructured and heterogeneous, hindering unified analysis and automation in vulnerability management. Method: We propose the first large language model (LLM)-based cross-tool structuring framework, leveraging GPT-4.1 and DeepSeek with domain-adapted prompt engineering and rule-guided post-processing to extract key fields—including vulnerability description, CVSS score, and affected components—and output standardized JSON. Contribution/Results: Evaluated on 34 real-world vulnerability reports, our approach achieves ROUGE-L scores exceeding 0.7—significantly outperforming traditional rule-based methods—and demonstrates strong generalization across diverse report formats. It enables downstream tasks such as sensitive information anonymization and risk prioritization. This work represents the first LLM-driven unified parsing solution for these two industry-standard scanners, establishing a scalable, low-maintenance paradigm for automated vulnerability governance.

Technology Category

Application Category

📝 Abstract
This paper proposes an automated LLM-based method to extract and structure vulnerabilities from OpenVAS and Tenable WAS scanner reports, converting unstructured data into a standardized format for risk management. In an evaluation using a report with 34 vulnerabilities, GPT-4.1 and DeepSeek achieved the highest similarity to the baseline (ROUGE-L greater than 0.7). The method demonstrates feasibility in transforming complex reports into usable datasets, enabling effective prioritization and future anonymization of sensitive data.
Problem

Research questions and friction points this paper is trying to address.

Extracting structured vulnerability data from security scanner reports
Converting unstructured OpenVAS and Tenable WAS data into standardized format
Enabling effective risk prioritization through automated vulnerability extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs extract vulnerabilities from scanner reports
Converts unstructured data into standardized format
GPT-4 and DeepSeek achieve highest similarity scores
🔎 Similar Papers
No similar papers found.
B
Beatriz Machado
AI Horizon Labs, Federal University of Pampa (UNIPAMPA)
D
Douglas Lautert
AI Horizon Labs, Federal University of Pampa (UNIPAMPA)
C
Cristhian Kapelinski
AI Horizon Labs, Federal University of Pampa (UNIPAMPA)
Diego Kreutz
Diego Kreutz
Federal University of Pampa (UNIPAMPA)
AutoML&XAI&AML for CybersecurityNetwork SecurityMalware & Attack DetectionBlockchainsSystems