🤖 AI Summary
Traditional CTI approaches rely on volatile low-level indicators (e.g., domains, IPs). Because adversaries change their infrastructure rapidly, these indicators are not robust enough to attribute disinformation campaigns, and they adapt poorly across platforms. To address this, we propose a CTI framework designed specifically for disinformation analysis. First, it replaces infrastructure-centric indicators with concept-level semantic structures, such as narrative patterns and entity relationships, which serve as stable, interpretable CTI primitives. Second, it introduces FakeCTI, the first publicly available dataset linking fake news instances, disinformation campaigns, and threat actors. Third, it integrates fine-tuned large language models (LLMs) with classical NLP techniques to enable semantic-driven concept extraction, contextual modeling, and multi-source attribution. Experiments demonstrate substantial improvements in indicator persistence, cross-platform transferability, and spatiotemporal attribution capability, moving CTI from ephemeral infrastructure traces toward durable, semantics-grounded intelligence.
📝 Abstract
The swift spread of fake news and disinformation campaigns poses a significant threat to public trust, political stability, and cybersecurity. Traditional Cyber Threat Intelligence (CTI) approaches, which rely on low-level indicators such as domain names and social media handles, are easily evaded by adversaries who frequently modify their online infrastructure. To address these limitations, we introduce a novel CTI framework that focuses on high-level, semantic indicators derived from recurrent narratives and relationships of disinformation campaigns. Our approach extracts structured CTI indicators from unstructured disinformation content, capturing key entities and their contextual dependencies within fake news using Large Language Models (LLMs). We further introduce FakeCTI, the first dataset that systematically links fake news to disinformation campaigns and threat actors. To evaluate the effectiveness of our CTI framework, we analyze multiple fake news attribution techniques, spanning from traditional Natural Language Processing (NLP) to fine-tuned LLMs. This work shifts the focus from low-level artifacts to persistent conceptual structures, establishing a scalable and adaptive approach to tracking and countering disinformation campaigns.
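To make the idea of a high-level semantic indicator concrete, the following is a minimal sketch, not the paper's method or the FakeCTI schema. It assumes that entities and relationships have already been extracted from an article (for instance by an LLM) as (subject, relation, object) triples, and it attributes the article to the campaign whose recurring triples overlap most, using Jaccard similarity as a stand-in for the attribution techniques the abstract mentions. The names `CampaignProfile`, `jaccard`, `attribute`, and the `threshold` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple

# Assumption: a concept-level indicator is modeled as a
# (subject, relation, object) triple. The paper may use a richer structure.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class CampaignProfile:
    """Hypothetical record linking a campaign to an actor and its narratives."""
    name: str
    threat_actor: str
    indicators: FrozenSet[Triple]


def jaccard(a: FrozenSet[Triple], b: FrozenSet[Triple]) -> float:
    """Jaccard similarity of two triple sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def attribute(
    article_triples: FrozenSet[Triple],
    profiles: Iterable[CampaignProfile],
    threshold: float = 0.2,  # illustrative cut-off, not taken from the paper
) -> Tuple[Optional[CampaignProfile], float]:
    """Return the best-matching campaign and its score, or (None, score)
    when no campaign reaches the threshold."""
    best: Optional[CampaignProfile] = None
    best_score = 0.0
    for profile in profiles:
        score = jaccard(article_triples, profile.indicators)
        if score > best_score:
            best, best_score = profile, score
    if best_score < threshold:
        return None, best_score
    return best, best_score
```

Because the match depends only on the narrative triples, an article republished under a new domain or account would receive the same score in this sketch, which is the persistence property the abstract contrasts with domain names and social media handles.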