LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification

📅 2025-07-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional cyber threat intelligence (CTI) credibility assessment predominantly relies on static classification paradigms—either handcrafted feature engineering or isolated deep learning models—rendering it ill-suited to CTI's inherent incompleteness, heterogeneity, and noise, while also lacking interpretability. To address these limitations, this paper proposes a large language model (LLM)-based multi-step reasoning framework that integrates adaptive information extraction, iterative evidence retrieval, and prompt-driven natural language inference to enable dynamic and transparent credibility evaluation. The framework models the entire CTI verification process end-to-end, substantially enhancing decision robustness and auditability. Evaluated on the CTI-200 and PolitiFact benchmarks, it achieves 90.9% macro-F1 and 93.6% micro-F1, outperforming state-of-the-art approaches by a statistically significant margin.

📝 Abstract
Verifying the credibility of Cyber Threat Intelligence (CTI) is essential for reliable cybersecurity defense. However, traditional approaches typically treat this task as a static classification problem, relying on handcrafted features or isolated deep learning models. These methods often lack the robustness needed to handle incomplete, heterogeneous, or noisy intelligence, and they provide limited transparency in decision-making, factors that reduce their effectiveness in real-world threat environments. To address these limitations, we propose LRCTI, a Large Language Model (LLM)-based framework designed for multi-step CTI credibility verification. The framework first employs a text summarization module to distill complex intelligence reports into concise and actionable threat claims. It then uses an adaptive multi-step evidence retrieval mechanism that iteratively identifies and refines supporting information from a CTI-specific corpus, guided by LLM feedback. Finally, a prompt-based Natural Language Inference (NLI) module is applied to evaluate the credibility of each claim while generating interpretable justifications for the classification outcome. Experiments conducted on two benchmark datasets, CTI-200 and PolitiFact, show that LRCTI improves F1-Macro and F1-Micro scores by over 5%, reaching 90.9% and 93.6%, respectively, compared to state-of-the-art baselines. These results demonstrate that LRCTI effectively addresses the core limitations of prior methods, offering a scalable, accurate, and explainable solution for automated CTI credibility verification.
Problem

Research questions and friction points this paper is trying to address.

Handling incomplete, heterogeneous, or noisy CTI data
Providing transparent decision-making in CTI credibility verification
Improving accuracy and scalability in automated CTI verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based multi-step CTI credibility verification
Adaptive evidence retrieval with LLM feedback
Prompt-based NLI for interpretable justifications
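The three-stage pipeline described in the abstract (claim summarization, iterative evidence retrieval with LLM feedback, prompt-based NLI) can be sketched roughly as below. This is not the paper's implementation: the function names, prompts, stopping rule, and the toy term-overlap retriever are all illustrative assumptions, with the LLM passed in as an opaque callable.

```python
from typing import Callable, List

def verify_claim(report: str,
                 corpus: List[str],
                 llm: Callable[[str], str],
                 max_steps: int = 3) -> dict:
    """Hypothetical sketch of an LRCTI-style verification pipeline."""
    # Step 1: distill the intelligence report into one concise threat claim.
    claim = llm(f"Summarize this CTI report as one claim: {report}")

    # Step 2: iteratively retrieve evidence, refining the query from LLM feedback.
    evidence: List[str] = []
    query = claim
    for _ in range(max_steps):
        # Toy lexical retriever: rank corpus documents by term overlap with the query.
        hits = sorted(corpus,
                      key=lambda d: len(set(query.lower().split())
                                        & set(d.lower().split())),
                      reverse=True)[:2]
        evidence.extend(h for h in hits if h not in evidence)
        feedback = llm(f"Is this evidence sufficient for '{claim}'? "
                       f"Evidence: {evidence}")
        if feedback.strip().upper().startswith("SUFFICIENT"):
            break
        query = feedback  # use the critique to steer the next retrieval round

    # Step 3: prompt-based NLI over the claim and gathered evidence,
    # asking for a label plus a natural-language justification.
    verdict = llm(f"Claim: {claim}\nEvidence: {evidence}\n"
                  "Label CREDIBLE or NOT-CREDIBLE and justify briefly.")
    return {"claim": claim, "evidence": evidence, "verdict": verdict}
```

In practice the `llm` callable would wrap an actual model API and the retriever would query a CTI-specific corpus; the structure above only illustrates how the three modules hand results to one another.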
Fengxiao Tang
Tohoku University; Central South University
deep learning, wireless networks
Huan Li
Central South University, Changsha, China
Ming Zhao
Central South University, Changsha, China
Zongzong Wu
Central South University, Changsha, China
Shisong Peng
Central South University, Changsha, China
Tao Yin
Zhongguancun Laboratory, Beijing, China