Conspiracy theories and where to find them on TikTok

📅 2024-07-17
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
📄 PDF
🤖 AI Summary
This study presents the first systematic analysis of conspiracy-theory content on TikTok, covering its dissemination, potential harms, and implications for platform governance. Leveraging the official TikTok Research API, the authors collect a three-year longitudinal dataset of 1.5 million videos shared in the U.S., which they use to estimate the prevalence of conspiratorial content over time and to examine the relationship between platform incentives (the Creativity Program for monetization) and content risk. Methodologically, they evaluate state-of-the-art open-weight LLMs (Llama, Phi) at identifying conspiracy theories from audio transcriptions, compare them against a fine-tuned RoBERTa baseline, and deploy an ASR-plus-text-classification pipeline to estimate incident volume. Results show that the monetization program increases video duration overall, regardless of content; the LLMs reach up to 96% precision in detecting harmful content, though their end-to-end performance remains comparable to fine-tuned RoBERTa; and the pipeline yields a lower-bound prevalence estimate of up to 1,000 new conspiracy videos per month. The work establishes a methodological and empirical foundation for mitigating harmful information on short-video platforms.
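The two-stage pipeline described in the summary (speech-to-text, then text classification over transcripts, then counting flagged videos) can be sketched as below. This is a minimal illustration, not the authors' code: the transcriber and the keyword classifier are stand-ins for the real components (an ASR model and a fine-tuned classifier such as RoBERTa), and all names and cue words are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Video:
    video_id: str
    audio: str  # stands in for raw audio; a real pipeline would hold waveforms


def transcribe(video: Video) -> str:
    """Stage 1: speech-to-text. Stand-in that returns the stored text;
    a real pipeline would run an ASR model over the audio track."""
    return video.audio


# Hypothetical cue phrases; a fine-tuned classifier replaces this in practice.
CONSPIRACY_CUES = {"hoax", "cover-up", "they don't want you to know"}


def classify(transcript: str) -> bool:
    """Stage 2: flag conspiratorial transcripts (keyword stand-in)."""
    text = transcript.lower()
    return any(cue in text for cue in CONSPIRACY_CUES)


def estimate_volume(videos: list[Video]) -> int:
    """Count flagged videos; misses by the classifier make this a lower bound."""
    return sum(classify(transcribe(v)) for v in videos)


videos = [
    Video("v1", "The moon landing was a hoax, wake up."),
    Video("v2", "Today we review three budget microphones."),
]
print(estimate_volume(videos))  # flags only the first video -> 1
```

Because stage 2 only counts what it positively identifies, the resulting monthly figure is naturally a lower bound on true prevalence, which matches how the paper frames its estimate.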

πŸ“ Abstract
TikTok has skyrocketed in popularity over recent years, especially among younger audiences. However, there are public concerns about the potential of this platform to promote and amplify harmful content. This study presents the first systematic analysis of conspiracy theories on TikTok. By leveraging the official TikTok Research API we collect a longitudinal dataset of 1.5M videos shared in the U.S. over three years. We estimate a lower bound on the prevalence of conspiratorial videos (up to 1000 new videos per month) and evaluate the effects of TikTok's Creativity Program for monetization, observing an overall increase in video duration regardless of content. Lastly, we evaluate the capabilities of state-of-the-art open-weight Large Language Models to identify conspiracy theories from audio transcriptions of videos. While these models achieve high precision in detecting harmful content (up to 96%), their overall performance remains comparable to fine-tuned traditional models such as RoBERTa. Our findings suggest that Large Language Models can serve as an effective tool for supporting content moderation strategies aimed at reducing the spread of harmful content on TikTok.
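The abstract's distinction between high precision (up to 96%) and merely comparable overall performance follows from the metric definitions: a conservative detector can be right about most of what it flags while still missing many positives. A small worked example with hypothetical counts (not figures from the paper):

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard detection metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)          # fraction of flagged items that are correct
    recall = tp / (tp + fn)             # fraction of true positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: the detector flags 100 videos and is right about 96,
# but misses another 100 true positives entirely.
p, r, f1 = precision_recall_f1(tp=96, fp=4, fn=100)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# precision=0.96 recall=0.49 f1=0.65
```

Under these assumed counts, precision is 0.96 while F1 sits near 0.65, so a fine-tuned baseline with more balanced precision and recall can match or beat the LLM end to end.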
Problem

Research questions and friction points this paper is trying to address.

Analyzing prevalence of conspiracy theories on TikTok
Assessing impact of monetization on conspiratorial content
Evaluating LLMs for detecting harmful TikTok videos
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraged TikTok Research API for data collection
Evaluated TikTok's Creativity Program effects
Tested Large Language Models for content moderation
🔎 Similar Papers
No similar papers found.