🤖 AI Summary
This study systematically evaluates open-source large language models (LLMs) for detecting conspiracy-theory content in real YouTube videos. Addressing platform-level misinformation risks, we compare zero-shot text-based LLMs (e.g., LLaMA), multimodal models (e.g., Qwen-VL), and a fine-tuned RoBERTa baseline, using video metadata (titles, descriptions, and subtitles) as input. Results show that text-based LLMs achieve high recall but substantially lower precision; multimodal fusion yields no performance gain, confirming the dominance of textual features; and RoBERTa attains LLM-comparable performance with orders-of-magnitude fewer parameters, generalizing well to unlabeled real-world data. To our knowledge, this is the first empirical evaluation of open-source LLMs' precision–recall trade-offs in authentic YouTube settings. Our findings highlight the practical utility of lightweight models under resource constraints and point toward a deployable, scalable solution for platform-level misinformation mitigation.
📝 Abstract
As a leading online platform with a vast global audience, YouTube is also susceptible to hosting harmful content, including disinformation and conspiracy theories. This study explores the use of open-weight Large Language Models (LLMs), both text-only and multimodal, for identifying conspiracy-theory videos shared on YouTube. Leveraging a labeled dataset of thousands of videos, we evaluate a variety of LLMs in a zero-shot setting and compare their performance to a fine-tuned RoBERTa baseline. Results show that text-based LLMs achieve high recall but lower precision, leading to increased false positives. Multimodal models lag behind their text-only counterparts, indicating limited benefit from integrating visual data. To assess real-world applicability, we evaluate the most accurate models on an unlabeled dataset, finding that RoBERTa achieves performance close to that of LLMs with far more parameters. Our work highlights the strengths and limitations of current LLM-based approaches to detecting harmful content online, emphasizing the need for more precise and robust systems.
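The zero-shot setting described above can be sketched as follows. This is a minimal illustration, not the authors' exact protocol: the prompt wording, the `llm_generate` callable, and the YES/NO label parsing are all assumptions standing in for whichever open-weight model (e.g., a LLaMA variant served via a local inference backend) is under evaluation.

```python
# Sketch of zero-shot conspiracy-theory classification from video metadata.
# The prompt template and label parsing here are illustrative assumptions.

def build_prompt(title: str, description: str, subtitles: str) -> str:
    """Assemble video metadata into a zero-shot classification prompt."""
    return (
        "Decide whether the following YouTube video promotes a conspiracy "
        "theory. Answer with exactly one word: YES or NO.\n\n"
        f"Title: {title}\n"
        f"Description: {description}\n"
        f"Subtitles: {subtitles}\n\n"
        "Answer:"
    )

def parse_label(model_output: str) -> bool:
    """Map the model's free-text answer to a binary label (True = conspiracy)."""
    return model_output.strip().upper().startswith("YES")

def classify(llm_generate, title: str, description: str, subtitles: str) -> bool:
    """Run one video through the zero-shot pipeline.

    llm_generate is a placeholder for any open-weight LLM completion call
    (e.g., a Hugging Face transformers text-generation pipeline).
    """
    return parse_label(llm_generate(build_prompt(title, description, subtitles)))
```

In a setup like this, predicted labels would then be compared against the annotated dataset to compute the precision and recall figures discussed in the abstract.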