🤖 AI Summary
This study shows that AI-assisted peer review systems are vulnerable to low-cost strategic manipulation, which threatens the fairness of scientific evaluation. Superficial linguistic rewrites of a paper's abstract, made without altering the scientific content, can mislead large language model reviewers (Gemini 3 Flash and GPT 5.4 Mini in the reported results), even when the attacker has no knowledge of the reviewing model. The strongest adversarial rewriting attack achieves an attack success rate of about 38% across disciplines and venues (more than 50% when the original AI review suggests rejection) and raises acceptance ratings by 0.88 to 1.31 points on a 10-point scale. It also increases review confidence and scores on core criteria such as soundness, significance, and perceived contribution, exposing a general fragility in current AI-based peer review.
📝 Abstract
AI is increasingly used to support scientific peer review, from manuscript screening and reviewer assistance to editorial triage. Although such systems promise to reduce reviewer burden and accelerate publication, their robustness to strategic manipulation remains poorly understood. Here we show that AI-mediated peer review is vulnerable to a simple, low-cost manipulation: superficial rephrasing of the manuscript abstract. Without changing the underlying scientific content and communication, and even without knowledge of the reviewing model, adversarially rewritten abstracts substantially improve AI review outcomes. We see this across disciplines and publication venues, for both human-written and AI-generated papers. Our strongest attack achieves an attack success rate of about 38%, increasing acceptance ratings by +1.31 for Gemini 3 Flash reviewers and by +0.88 for GPT 5.4 Mini reviewers on a 10-point scale. When the original AI review suggests 'reject', the success rate rises to more than 50%. This effect extends beyond overall score inflation, increasing review confidence and scores on core scientific criteria such as soundness, significance and perceived contribution. The attack is practical, requiring only about 5 minutes and $1 for a 10-page AI conference submission, and is hard to distinguish from ordinary scientific editing. Inflated AI reviews could bias downstream human decision-making, shifting editorial recommendations from rejection towards acceptance. These findings reveal a general vulnerability in AI-assisted scientific evaluation: when AI-generated reviews influence editorial decisions, authors may be incentivized to optimize manuscripts for AI judgment rather than scientific merit. Our results suggest that AI tools should not be treated as neutral evaluators in high-stakes peer review without systematic robustness testing, transparent safeguards and careful human oversight.
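The abstract reports three headline quantities: an overall attack success rate, a success rate restricted to papers the AI reviewer originally rejected, and a mean shift in acceptance rating. The sketch below shows one way such numbers could be computed from paired review scores. It is illustrative only: the abstract does not define "success", so the criterion used here (a gain of at least `min_gain` points), the `ACCEPT_THRESHOLD` cutoff, and the `ReviewPair` type are all assumptions, and the demo scores are made up.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical cutoff: scores below this count as an original "reject".
ACCEPT_THRESHOLD = 6.0


@dataclass
class ReviewPair:
    """AI review scores (10-point scale) for one paper, before and after rewriting."""
    original: float
    rewritten: float


def mean_shift(pairs):
    """Average change in score caused by the rewrite."""
    return mean(p.rewritten - p.original for p in pairs)


def attack_success_rate(pairs, min_gain=1.0):
    """Fraction of papers whose score rose by at least min_gain.

    Assumed success criterion; the paper may define success differently.
    Returns 0.0 for an empty input.
    """
    if not pairs:
        return 0.0
    wins = sum(1 for p in pairs if p.rewritten - p.original >= min_gain)
    return wins / len(pairs)


def success_rate_given_reject(pairs, threshold=ACCEPT_THRESHOLD, min_gain=1.0):
    """Success rate restricted to papers originally scored below the threshold."""
    rejected = [p for p in pairs if p.original < threshold]
    return attack_success_rate(rejected, min_gain)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    demo = [
        ReviewPair(4.0, 6.0),
        ReviewPair(7.0, 7.0),
        ReviewPair(5.0, 5.5),
        ReviewPair(3.0, 5.0),
    ]
    print("mean shift:", mean_shift(demo))
    print("success rate:", attack_success_rate(demo))
    print("success rate given reject:", success_rate_given_reject(demo))
```

Conditioning on the original decision matters because papers that already score well have little room to improve, which is consistent with the higher success rate the abstract reports for originally rejected papers.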