Compound Deception in Elite Peer Review: A Failure Mode Taxonomy of 100 Fabricated Citations at NeurIPS 2025

📅 2026-02-05

🤖 AI Summary

This study addresses the concern that large language models (LLMs) frequently generate fabricated citations in academic writing, and that these citations evade detection during peer review at top-tier venues such as NeurIPS 2025. Through a systematic analysis of 100 AI-generated hallucinated citations, the work introduces a five-category failure mode taxonomy and the concept of “compound failure modes,” showing how multiple failure types, such as Semantic Hallucination and Identifier Hijacking, combine to make a fabricated reference appear plausible and verifiable. Combining manual content analysis with structural citation parsing, the investigation finds that the citations appeared in 53 accepted papers (approximately 1% of all accepted papers). Of those papers, 92% contain one or two hallucinated citations and 8% contain 4-13, which the authors attribute to heavy reliance on AI. Based on these findings, the authors advocate mandatory automated citation verification at the submission stage as a mitigation strategy.

📝 Abstract

Large language models (LLMs) are increasingly used in academic writing workflows, yet they frequently hallucinate by generating citations to sources that do not exist. This study analyzes 100 AI-generated hallucinated citations that appeared in papers accepted by the 2025 Conference on Neural Information Processing Systems (NeurIPS), one of the world's most prestigious AI conferences. Despite review by 3-5 expert researchers per paper, these fabricated citations evaded detection, appearing in 53 published papers (approx. 1% of all accepted papers). We develop a five-category taxonomy that classifies hallucinations by their failure mode: Total Fabrication (66%), Partial Attribute Corruption (27%), Identifier Hijacking (4%), Placeholder Hallucination (2%), and Semantic Hallucination (1%). Our analysis reveals a critical finding: every hallucination (100%) exhibited compound failure modes. The distribution of secondary characteristics was dominated by Semantic Hallucination (63%) and Identifier Hijacking (29%), which often appeared alongside Total Fabrication to create a veneer of plausibility and false verifiability. These compound structures exploit multiple verification heuristics simultaneously, explaining why peer review fails to detect them. The distribution exhibits a bimodal pattern: 92% of contaminated papers contain 1-2 hallucinations (minimal AI use) while 8% contain 4-13 hallucinations (heavy reliance). These findings demonstrate that current peer review processes do not include effective citation verification and that the problem extends beyond NeurIPS to other major conferences, government reports, and professional consulting. We propose mandatory automated citation verification at submission as an implementable solution to prevent fabricated citations from becoming normalized in scientific literature.
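The abstract does not describe how automated citation verification would be implemented. As a rough illustration only, the sketch below checks each reference against a bibliographic index and flags entries whose identifier resolves to a different title or that cannot be found at all. Everything in it is an assumption made for illustration: the `Reference` type, the in-memory `DEMO_INDEX` (a stand-in for an external metadata service), the made-up DOIs and titles, the status labels, and the 0.9 similarity threshold. The status labels are not the paper's taxonomy categories.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, Optional


@dataclass(frozen=True)
class Reference:
    """A cited work as it appears in a submission (hypothetical type)."""
    title: str
    doi: Optional[str] = None


# Hypothetical stand-in for a bibliographic index, keyed by DOI.
# The DOIs and titles are invented; a real checker would query a
# metadata service instead of a hard-coded dictionary.
DEMO_INDEX: Dict[str, str] = {
    "10.0000/demo-1": "A Placeholder Study of Widgets",
    "10.0000/demo-2": "Another Illustrative Entry",
}


def title_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two titles."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def check_reference(ref: Reference, index: Dict[str, str],
                    threshold: float = 0.9) -> str:
    """Return 'verified', 'identifier_mismatch', or 'not_found'.

    The threshold is an assumed value, not one taken from the paper.
    """
    if ref.doi is not None and ref.doi in index:
        # The identifier exists: it must point at the cited title.
        if title_similarity(ref.title, index[ref.doi]) >= threshold:
            return "verified"
        return "identifier_mismatch"
    # No usable identifier: fall back to the closest title in the index.
    best = max((title_similarity(ref.title, t) for t in index.values()),
               default=0.0)
    return "verified" if best >= threshold else "not_found"


def flag_references(refs, index, threshold: float = 0.9):
    """Return the references that need human review, with their status."""
    results = ((ref, check_reference(ref, index, threshold)) for ref in refs)
    return [(ref, status) for ref, status in results if status != "verified"]
```

A submission system could run `flag_references` over a parsed bibliography and return the flagged entries to the authors before review. Checking the identifier and the title together is a deliberate choice here: a check on either one alone would pass a reference that pairs a real identifier with an invented title.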

Problem

Research questions and friction points this paper is trying to address.

hallucinated citations

peer review failure

LLM hallucination

fabricated references

citation verification

Innovation

Methods, ideas, or system contributions that make the work stand out.

compound deception

hallucinated citations

failure mode taxonomy

automated citation verification

peer review vulnerability
