🤖 AI Summary
The proliferation of AI-generated disinformation from large language models (LLMs) poses escalating societal risks. Method: This study uses a university-level competition to systematically investigate how humans leverage LLMs (e.g., GPT, Claude) to fabricate disinformation, and comparatively evaluates human annotators versus LLMs at detecting true versus AI-generated false news, assessing both individual performance and human-AI collaboration. Contribution/Results: We report the first empirical finding that LLMs significantly outperform humans at identifying authentic news (85% vs. 17% accuracy), yet match humans at only ~60% accuracy in detecting AI-generated disinformation. Crucially, multimodal (text-image) integration and perceived author credibility substantially degrade detection accuracy. The study further identifies seven distinct authorial strategies that enhance the plausibility of AI-generated disinformation. These findings provide foundational empirical evidence and a methodological framework for understanding adversarial dynamics and collaborative defense mechanisms in human-AI disinformation ecosystems.
📝 Abstract
With the rise of AI-generated content produced at scale by large language models (LLMs), genuine concerns about the spread of fake news have intensified. The perceived ability of LLMs to produce convincing fake news at scale poses new challenges for both human and automated fake news detection systems. To address this gap, this paper presents findings from a university-level competition that explored how humans can use LLMs to create fake news, and assessed the ability of human annotators and AI models to detect it. A total of 110 participants used LLMs to create 252 unique fake news stories, and 84 annotators participated in the detection tasks. Our findings indicate that LLMs are substantially more effective than humans at identifying real news (roughly 68 percentage points higher accuracy, 85% vs. 17%). For fake news detection, however, the performance of LLMs and humans remains comparable (~60% accuracy). Additionally, we examine the impact of visual elements (e.g., pictures) in news on the accuracy of detecting fake news stories. Finally, we examine various strategies used by fake news creators to enhance the credibility of their AI-generated content. This work highlights the increasing complexity of detecting AI-generated fake news, particularly in collaborative human-AI settings.
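The detection task described above reduces to binary classification (real vs. fake) scored by accuracy, for both human annotators and LLM detectors. Below is a minimal sketch of how such an LLM-based evaluation might be run; the `query_llm` placeholder, the prompt wording, and the toy dataset are illustrative assumptions, not the authors' actual protocol or data.

```python
from typing import Callable

# Hypothetical labeled set: each item is (news_text, true_label),
# where true_label is "real" or "fake". The paper's corpus of 252
# stories would take this place in a real run.
DATASET = [
    ("City council approves new transit budget after public hearing.", "real"),
    ("Scientists confirm the moon is hollow and filled with machinery.", "fake"),
]

PROMPT = (
    "Classify the following news story as 'real' or 'fake'. "
    "Answer with a single word.\n\nStory: {story}"
)

def evaluate(detector: Callable[[str], str]) -> float:
    """Return the accuracy of a real/fake detector over DATASET."""
    correct = sum(
        1
        for story, label in DATASET
        if detector(PROMPT.format(story=story)).strip().lower() == label
    )
    return correct / len(DATASET)

def query_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM API call here. Hard-coded so
    # the sketch runs standalone without network access.
    return "real"

if __name__ == "__main__":
    print(f"Detector accuracy: {evaluate(query_llm):.0%}")
```

Under this framing, the paper's headline numbers (85% vs. 17% on real news, ~60% on fake news for both humans and LLMs) are simply this accuracy metric computed per annotator type and per news class.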