🤖 AI Summary
This work proposes SpecFi, a framework for detecting climate disinformation narratives without relying on predefined taxonomies. Because classification-based approaches cannot capture narratives that emerge after a taxonomy is fixed, SpecFi reformulates narrative detection as an unsupervised retrieval task. It uses graph clustering to generate community summaries that serve as few-shot exemplars, synthesizes hypothetical documents as queries, and combines BM25 with neural embeddings for retrieval ranking, which allows emerging narratives to be identified without labeled data. The authors introduce a narrative variance metric to quantify retrieval difficulty and evaluate SpecFi on the CARDS dataset, where it achieves a mean average precision (MAP) of 0.505 and outperforms existing baselines, particularly on high-variance narratives.
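Since results are reported as MAP, it may help to see how that metric is usually computed for a retrieval task of this kind. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function names, narrative IDs, and document IDs are hypothetical.

```python
from typing import Dict, List, Set


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision for one query (here: one narrative).

    Averages precision@k over every rank k that holds a relevant text,
    normalised by the total number of relevant texts.
    """
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(
    rankings: Dict[str, List[str]],
    relevance: Dict[str, Set[str]],
) -> float:
    """Mean of the per-narrative average precision values."""
    scores = [
        average_precision(ranked, relevance.get(query, set()))
        for query, ranked in rankings.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two narratives, each with a ranked list of text IDs.
toy_rankings = {"n1": ["a", "b", "c", "d"], "n2": ["x", "y", "z"]}
toy_relevance = {"n1": {"a", "c"}, "n2": {"y"}}
toy_map = mean_average_precision(toy_rankings, toy_relevance)
```

In this toy case the first narrative has relevant texts at ranks 1 and 3 and the second at rank 2, so the score rewards rankings that place aligned texts near the top.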
📝 Abstract
Detecting climate disinformation narratives typically relies on fixed taxonomies, which do not accommodate emerging narratives. We therefore reframe narrative detection as a retrieval task: given a narrative's core message as a query, rank texts from a corpus by alignment with that narrative. This formulation requires no predefined label set and can accommodate emerging narratives. We repurpose three climate disinformation datasets (CARDS, Climate Obstruction, and the climate change subset of PolyNarrative) for retrieval evaluation and propose SpecFi, a framework that generates hypothetical documents to bridge the gap between abstract narrative descriptions and their concrete textual instantiations. SpecFi uses community summaries from graph-based community detection as few-shot examples for generation, achieving a MAP of 0.505 on CARDS without access to narrative labels. We further introduce narrative variance, an embedding-based difficulty metric, and show via partial correlation analysis that standard retrieval degrades on high-variance narratives (BM25 loses 63.4% of MAP), while SpecFi-CS remains robust (32.7% loss). Our analysis also reveals that unsupervised community summaries converge on descriptions close to expert-crafted taxonomies, suggesting that graph-based methods can surface narrative structure from unlabeled text.
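The abstract describes narrative variance only as an embedding-based difficulty metric and does not give its formula. As a rough illustration of what such a metric could look like, the sketch below assumes it is the mean squared Euclidean distance of a narrative's text embeddings from their centroid. That definition, the function name, and the toy vectors are all assumptions made for illustration, not the authors' formulation.

```python
from typing import List, Sequence


def narrative_variance(embeddings: Sequence[Sequence[float]]) -> float:
    """Spread of one narrative's text embeddings around their centroid.

    ASSUMPTION: computed as the mean squared Euclidean distance to the
    centroid. The paper's actual definition may differ.
    """
    if not embeddings:
        raise ValueError("at least one embedding is required")
    n = len(embeddings)
    dim = len(embeddings[0])
    # Centroid: per-dimension mean over all texts of the narrative.
    centroid: List[float] = [
        sum(vec[i] for vec in embeddings) / n for i in range(dim)
    ]
    # Mean squared distance of each embedding from that centroid.
    return sum(
        sum((vec[i] - centroid[i]) ** 2 for i in range(dim))
        for vec in embeddings
    ) / n


# Hypothetical 2-d embeddings: a narrative whose texts all look alike,
# and one whose texts point in different directions.
tight_narrative = [[1.0, 0.0], [1.0, 0.0]]
spread_narrative = [[1.0, 0.0], [0.0, 1.0]]

low = narrative_variance(tight_narrative)
high = narrative_variance(spread_narrative)
```

Under this assumed definition, a narrative expressed in many dissimilar ways scores higher than one with near-identical texts, which matches the intuition that high-variance narratives are harder to retrieve from a single query.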