Gaming AI-Assisted Peer Reviews Poses New Risks to the Scientific Community

📅 2026-06-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study shows that AI-assisted peer review systems are vulnerable to low-cost strategic manipulation, jeopardizing the fairness of scientific evaluation. Superficial linguistic rewrites of paper abstracts, which leave the scientific content unchanged, can mislead mainstream large language model reviewers (e.g., Gemini 3 Flash, GPT 5.4 Mini), even when the attacker has no knowledge of the reviewing model. The strongest adversarial rewriting attack reaches an attack success rate of about 38% across disciplines and publication venues (more than 50% when the original AI review recommends rejection) and raises acceptance ratings by 0.88 to 1.31 points on a 10-point scale. It also increases review confidence and scores on core criteria such as soundness, significance and perceived contribution, exposing a general fragility in current AI-based peer review mechanisms.

📝 Abstract

AI is increasingly used to support scientific peer review, from manuscript screening and reviewer assistance to editorial triage. Although such systems promise to reduce reviewer burden and accelerate publication, their robustness to strategic manipulation remains poorly understood. Here we show that AI-mediated peer review is vulnerable to a simple, low-cost manipulation: superficial rephrasing of the manuscript abstract. Without changing the underlying scientific content and communication, and even without knowledge of the reviewing model, adversarially rewritten abstracts substantially improve AI review outcomes. We see this across disciplines and publication venues, for both human-written and AI-generated papers. Our strongest attack achieves an attack success rate of about 38%, increasing acceptance ratings by +1.31 for Gemini 3 Flash reviewers and by +0.88 for GPT 5.4 Mini reviewers on a 10-point scale. When the original AI review suggests 'reject', the success rate rises to more than 50%. This effect extends beyond overall score inflation, increasing review confidence and scores on core scientific criteria such as soundness, significance and perceived contribution. The attack is practical, requiring only about 5 minutes and $1 for a 10-page AI conference submission, and is hard to distinguish from ordinary scientific editing. Inflated AI reviews could bias downstream human decision-making, shifting editorial recommendations from rejection towards acceptance. These findings reveal a general vulnerability in AI-assisted scientific evaluation: when AI-generated reviews influence editorial decisions, authors may be incentivized to optimize manuscripts for AI judgment rather than scientific merit. Our results suggest that AI tools should not be treated as neutral evaluators in high-stakes peer review without systematic robustness testing, transparent safeguards and careful human oversight.
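The headline metrics in the abstract (attack success rate, mean score shift, and the success rate among originally rejected papers) can be illustrated with a minimal sketch. The abstract does not define "success", so this sketch assumes an attack succeeds when the rewritten abstract yields a strictly higher overall score. The function name `summarize_attack`, the `accept_threshold` value, and the scores in the test data are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def summarize_attack(original, attacked, accept_threshold=6.0):
    """Compare per-paper AI review scores before and after abstract rewriting.

    Assumption (not from the paper): an attack "succeeds" when the rewritten
    abstract receives a strictly higher overall score than the original.
    Papers with an original score below `accept_threshold` (a hypothetical
    cutoff) are treated as originally rejected.
    """
    if len(original) != len(attacked) or not original:
        raise ValueError("score lists must be non-empty and of equal length")

    pairs = list(zip(original, attacked))

    # Fraction of all papers whose score went up after rewriting.
    success_rate = sum(after > before for before, after in pairs) / len(pairs)

    # Average change in score on the review scale.
    mean_shift = mean(after - before for before, after in pairs)

    # Same success criterion, restricted to originally rejected papers.
    rejected = [(b, a) for b, a in pairs if b < accept_threshold]
    rejected_rate = (
        sum(a > b for b, a in rejected) / len(rejected) if rejected else None
    )

    return {
        "attack_success_rate": success_rate,
        "mean_score_shift": mean_shift,
        "success_rate_when_rejected": rejected_rate,
    }
```

Calling `summarize_attack(before_scores, after_scores)` on paired score lists returns the three quantities as a dictionary. A study that defines success differently, for example as a change in the reviewer's recommendation, would replace the `after > before` comparison.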

Problem

Research questions and friction points this paper is trying to address.

AI-assisted peer review

strategic manipulation

adversarial rewriting

review robustness

scientific evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial manipulation

AI-assisted peer review

scientific evaluation vulnerability