🤖 AI Summary
To address the inefficiency and error-proneness of manually extracting attack testflows from unstructured cybersecurity threat reports, this paper proposes FLOWGUARDIAN, a novel framework that deeply integrates contextualized event modeling with attack-chain reconstruction. Built on BERT, FLOWGUARDIAN employs a multi-stage NLP pipeline for fine-grained semantic parsing of threat texts, cross-sentence event linking, and dynamic, context-aware testflow generation. Evaluated on a public threat-report dataset, it achieves 92.3% testflow completeness and 89.7% accuracy, substantially outperforming existing baseline methods. This work establishes a reproducible, high-fidelity paradigm for automated testflow generation, directly supporting executable attack simulation and scalable threat hunting.
📝 Abstract
In the ever-evolving cybersecurity landscape, the rapid identification and mitigation of Advanced Persistent Threats (APTs) are crucial. Security practitioners rely on detailed threat reports to understand the tactics, techniques, and procedures (TTPs) employed by attackers. However, manually extracting attack testflows from these reports demands specialized expertise and is both time-consuming and error-prone. This paper proposes FLOWGUARDIAN, a novel solution that leverages language models (specifically BERT) and Natural Language Processing (NLP) techniques to automate the extraction of attack testflows from unstructured threat reports. FLOWGUARDIAN systematically analyzes and contextualizes security events, reconstructs attack sequences, and then generates comprehensive testflows. This automated approach not only saves time and reduces human error but also ensures comprehensive coverage and robustness in cybersecurity testing. Empirical validation on public threat reports demonstrates FLOWGUARDIAN's accuracy and efficiency, significantly enhancing security teams' capabilities in proactive threat hunting and incident response.
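To make the pipeline concrete, here is a minimal, hypothetical sketch of the three stages the abstract describes (event extraction, attack-sequence reconstruction, testflow generation). All names, the keyword lexicon, and the tactic ordering are illustrative assumptions, not the paper's method: the real FLOWGUARDIAN uses BERT for semantic parsing, while a keyword lookup stands in for it here.

```python
import re

# Hypothetical stand-in for BERT-based semantic parsing: a keyword lexicon
# mapping action verbs to coarse attack tactics (illustrative, not from the paper).
ACTION_LEXICON = {
    "phishing": "initial-access",
    "downloads": "execution",
    "executes": "execution",
    "escalates": "privilege-escalation",
    "exfiltrates": "exfiltration",
}

# Assumed canonical ordering of tactics along an attack chain.
TACTIC_ORDER = ["initial-access", "execution", "privilege-escalation", "exfiltration"]

def extract_events(report: str):
    """Stage 1: parse each sentence of the report into (tactic, sentence) events."""
    events = []
    for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
        for keyword, tactic in ACTION_LEXICON.items():
            if keyword in sentence.lower():
                events.append((tactic, sentence))
                break  # one tactic per sentence in this toy model
    return events

def reconstruct_testflow(events):
    """Stages 2-3: order the linked events into a chain-consistent testflow."""
    return sorted(events, key=lambda e: TACTIC_ORDER.index(e[0]))

# Toy threat-report excerpt with sentences deliberately out of attack order.
report = (
    "The actor exfiltrates credentials to a remote server. "
    "A phishing email delivers the lure. "
    "The victim downloads and executes the payload."
)
testflow = reconstruct_testflow(extract_events(report))
```

Even this toy version shows why naive sentence order is insufficient: threat reports often narrate events out of sequence, so the reconstruction stage must reorder events into a coherent attack chain before a testflow can be emitted.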