🤖 AI Summary
This study addresses the inefficiency of manually sifting data from multi-source logs during cybersecurity incident analysis. To overcome this limitation, the authors propose a novel approach that integrates query-driven log filtering with Retrieval-Augmented Generation (RAG), combining a targeted query repository with a RAG architecture. Leveraging large language models (LLMs) for semantic reasoning, the method efficiently extracts indicators of compromise linked to MITRE ATT&CK techniques and reconstructs attack chains within strict context constraints. Experimental results show that the approach achieves 100% recall in malware scenarios—with DeepSeek V3 costing 15× less per analysis than Claude Sonnet 4—and attains 100% precision with 82% recall in Active Directory attack-step detection, significantly outperforming non-RAG baselines.
📝 Abstract
Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts, network traffic records, and authentication events. This process is labor-intensive: analysts must sift through large volumes of data to identify relevant indicators and piece together what happened. We present a RAG-based system that performs security incident analysis through targeted query-based filtering and LLM semantic reasoning. The system uses a query library with associated MITRE ATT&CK techniques to extract indicators from raw logs, then retrieves relevant context to answer forensic questions and reconstruct attack sequences. We evaluate the system with five LLM providers on malware traffic incidents and multi-stage Active Directory attacks. We find that the models differ in performance and cost: Claude Sonnet 4 and DeepSeek V3 both achieve 100% recall across all four malware scenarios, while DeepSeek costs 15× less ($0.008 vs. $0.12 per analysis). Attack step detection on Active Directory scenarios reaches 100% precision and 82% recall. Ablation studies confirm that the RAG architecture is essential: LLM baselines without RAG-enhanced context correctly identify victim hosts but miss all attack infrastructure, including malicious domains and command-and-control servers. These results demonstrate that combining targeted query-based filtering with RAG-based retrieval enables accurate, cost-effective security analysis within LLM context limits.
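To make the pipeline concrete, here is a minimal sketch of the idea the abstract describes: a query library whose entries are tagged with MITRE ATT&CK technique IDs filters raw logs down to candidate indicators, and only that filtered context is assembled into an LLM prompt to stay within context limits. All names (`QUERY_LIBRARY`, `filter_logs`, `build_prompt`), the example patterns, and the sample log lines are illustrative assumptions, not taken from the paper.

```python
import re

# Hypothetical query library: each entry pairs a log-matching pattern
# with the MITRE ATT&CK technique it evidences.
QUERY_LIBRARY = [
    {"technique": "T1071", "name": "Application Layer Protocol",
     "pattern": re.compile(r"dns query|http post", re.IGNORECASE)},
    {"technique": "T1110", "name": "Brute Force",
     "pattern": re.compile(r"authentication failure|4625", re.IGNORECASE)},
]

def filter_logs(log_lines):
    """Keep only lines matching a library query; tag each hit with its technique."""
    hits = []
    for line in log_lines:
        for q in QUERY_LIBRARY:
            if q["pattern"].search(line):
                hits.append({"technique": q["technique"], "line": line})
    return hits

def build_prompt(question, hits, budget=20):
    """RAG-style context assembly: feed only the filtered indicators to the
    LLM, truncated to a fixed budget to respect the model's context limit."""
    context = "\n".join(f"[{h['technique']}] {h['line']}" for h in hits[:budget])
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."

logs = [
    "2024-01-02 10:31:07 dns query evil.example from 10.0.0.5",
    "2024-01-02 10:31:09 event 4625 authentication failure for admin",
    "2024-01-02 10:31:11 benign heartbeat",
]
hits = filter_logs(logs)
print(len(hits))             # 2 — the benign line is filtered out
print(hits[0]["technique"])  # T1071
```

The key design point the abstract's ablation supports: without the filtering step, raw logs would either overflow the context window or drown the infrastructure indicators (domains, C2 servers) that the non-RAG baselines missed.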