Instantiating Standards: Enabling Standard-Driven Text TTP Extraction with Evolvable Memory

📅 2025-05-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current TTP extraction methods prioritize data-driven performance over consistency with the ATT&CK standard, which makes tactic–technique assignments unreliable and widens discrepancies in threat assessment between organizations. To address this, we propose the first ATT&CK-standard-driven LLM framework. It transforms official tactic and technique definitions into actionable, context-aware structured knowledge, and it introduces a two-level situational memory mechanism that explicitly models contextual semantics and fine-grained discriminative features, balancing standard compliance, interpretability, and human supervisability. Our approach integrates Qwen2.5-32B with ATT&CK knowledge distillation, dynamic memory updating, and retrieval-augmented reasoning. Experiments demonstrate an 11% improvement in Technique F1 over GPT-4o, along with better ATT&CK alignment, reasoning transparency, and cross-organizational comparability of assessments.

📝 Abstract
Extracting MITRE ATT&CK Tactics, Techniques, and Procedures (TTPs) from natural language threat reports is crucial yet challenging. Existing methods primarily focus on performance metrics using data-driven approaches, often neglecting mechanisms to ensure faithful adherence to the official standard. This deficiency compromises the reliability and consistency of TTP assignments, creating intelligence silos and contradictory threat assessments across organizations. To address this, we introduce a novel framework that converts abstract standard definitions into actionable, contextualized knowledge. Our method uses a Large Language Model (LLM) to generate, update, and apply this knowledge. The framework populates an evolvable memory with dual-layer situational knowledge instances derived from labeled examples and official definitions. The first layer identifies situational contexts (e.g., "Communication with C2 using encoded subdomains"), while the second layer captures distinctive features that differentiate similar techniques (e.g., distinguishing T1132 "Data Encoding" from T1071 "Application Layer Protocol" based on whether the focus is on encoding methods or protocol usage). This structured approach provides a transparent basis for explainable TTP assignments and enhanced human oversight, while also helping to standardize other TTP extraction systems. Experiments show our framework (using Qwen2.5-32B) boosts Technique F1 scores by 11% over GPT-4o. Qualitative analysis confirms superior standardization, enhanced transparency, and improved explainability in real-world threat intelligence scenarios. To the best of our knowledge, this is the first work that uses an LLM to generate, update, and apply new knowledge for TTP extraction.
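
The dual-layer memory described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `TechniqueEntry`, `EvolvableMemory`, `assign`, and `learn` are hypothetical, the feature strings are invented for the example, and a simple token-overlap score stands in for the LLM that the paper uses to generate, update, and apply the knowledge.

```python
from dataclasses import dataclass


@dataclass
class TechniqueEntry:
    technique_id: str  # ATT&CK technique ID, e.g. "T1132"
    name: str          # official technique name
    feature: str       # layer 2: what separates this technique from similar ones


def overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase tokens (a toy stand-in for LLM matching)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


class EvolvableMemory:
    def __init__(self):
        # Layer 1 key: situational context. Value: competing techniques (layer 2).
        self.by_situation = {}

    def add(self, situation: str, entry: TechniqueEntry) -> None:
        self.by_situation.setdefault(situation, []).append(entry)

    def assign(self, sentence: str):
        """Return (situation, entry): match the context first, then disambiguate."""
        if not self.by_situation:
            return None
        situation = max(self.by_situation, key=lambda s: overlap(sentence, s))
        entry = max(self.by_situation[situation],
                    key=lambda e: overlap(sentence, e.feature))
        return situation, entry

    def learn(self, sentence: str, situation: str, entry: TechniqueEntry) -> bool:
        """Store a new instance only if the memory currently mislabels the sentence."""
        result = self.assign(sentence)
        if result is None or result[1].technique_id != entry.technique_id:
            self.add(situation, entry)
            return True
        return False


memory = EvolvableMemory()
c2 = "communication with c2 using encoded subdomains"
memory.add(c2, TechniqueEntry(
    "T1132", "Data Encoding",
    "focus is on the encoding method such as base64 applied to the data"))
memory.add(c2, TechniqueEntry(
    "T1071", "Application Layer Protocol",
    "focus is on the protocol such as dns or http used to carry traffic"))

for text in ("malware encoded the stolen data with base64 before communication with c2",
             "malware used dns protocol traffic for communication with c2"):
    situation, entry = memory.assign(text)
    print(f"{text!r} -> {entry.technique_id} ({entry.name}); context: {situation!r}")

# A labeled example the memory gets wrong triggers an update (the "evolvable" part).
phish = "email with malicious attachment sent to victim"
learned = memory.learn(phish, phish, TechniqueEntry(
    "T1566", "Phishing", "focus is on delivery through a deceptive message"))
print(learned, memory.assign(phish)[1].technique_id)
```

Because each assignment returns both the matched situation and the distinguishing feature that won, a human reviewer can see why a technique was chosen, which is the kind of transparency and oversight the abstract emphasizes.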
Problem

Research questions and friction points this paper is trying to address.

Ensures faithful adherence to MITRE ATT&CK standards in TTP extraction
Addresses reliability and consistency issues in TTP assignments across organizations
Provides transparent, explainable TTP extraction with evolvable contextual knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLM to generate and update contextualized knowledge
Implements dual-layer memory for situational TTP features
Enhances standardization via evolvable memory framework
Cheng Meng
Institute of Statistics and Big Data, Renmin University of China
Data Science, Optimal transport, Subsampling, Smoothing Spline
ZhengWei Jiang
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
QiuYun Wang
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
XinYi Li
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
ChunYan Ma
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Fangming Dong
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Fangli Ren
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
BaoXu Liu
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China