Aspect-Guided Multi-Level Perturbation Analysis of Large Language Models in Automated Peer Review

📅 2025-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the insufficient robustness of large language models (LLMs) in automated peer review across three core stages—paper evaluation, review generation, and rebuttal response. We propose the first multi-level, aspect-oriented semantic perturbation framework targeting key review quality dimensions: contribution, rigor, clarity, tone, and completeness. Leveraging aspect-driven causal interventions and a dual-role LLM-as-Reviewer / LLM-as-Meta-Reviewer evaluation paradigm, we systematically uncover deep-seated biases—including conclusion-driven bias, misclassification of negative reviews as thorough, and inflated acceptance rates for adversarial rebuttals. Empirical analysis identifies five stable robustness vulnerabilities; all biases remain statistically significant (p < 0.01) across diverse chain-of-thought prompting strategies. The study establishes an interpretable, diagnosable evaluation paradigm and supporting tools for developing trustworthy automated peer review systems.

📝 Abstract
We propose an aspect-guided, multi-level perturbation framework to evaluate the robustness of Large Language Models (LLMs) in automated peer review. Our framework explores perturbations in three key components of the peer review process (papers, reviews, and rebuttals) across several quality aspects, including contribution, soundness, presentation, tone, and completeness. By applying targeted perturbations and examining their effects on both LLM-as-Reviewer and LLM-as-Meta-Reviewer, we investigate how aspect-based manipulations, such as omitting methodological details from papers or altering reviewer conclusions, can introduce significant biases in the review process. We identify several potential vulnerabilities: review conclusions that recommend a strong reject may significantly influence meta-reviews, negative or misleading reviews may be wrongly interpreted as thorough, and incomplete or hostile rebuttals can unexpectedly lead to higher acceptance rates. Statistical tests show that these biases persist under various Chain-of-Thought prompting strategies, highlighting the lack of robust critical evaluation in current LLMs. Our framework offers a practical methodology for diagnosing these vulnerabilities, thereby contributing to the development of more reliable and robust automated reviewing systems.
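The causal-intervention logic described above (perturb one aspect, hold everything else fixed, compare decisions) can be sketched in a few lines. This is an illustrative reconstruction, not the authors' released code: the names `perturb_conclusion` and `meta_review` are hypothetical, and the meta-reviewer is a toy rule standing in for an LLM call, chosen to exhibit the conclusion-driven bias the paper reports.

```python
# Hypothetical sketch of an aspect-guided perturbation probe.
# All names are illustrative; a real system would prompt an LLM
# where meta_review() applies a hard-coded rule.
from dataclasses import dataclass

ASPECTS = ["contribution", "soundness", "presentation", "tone", "completeness"]

@dataclass(frozen=True)
class Review:
    text: str         # body of the review (arguments, evidence)
    conclusion: str   # e.g. "accept", "strong reject"

def perturb_conclusion(review: Review, new_conclusion: str) -> Review:
    """Targeted perturbation: alter only the reviewer's conclusion,
    leaving the supporting arguments untouched."""
    return Review(text=review.text, conclusion=new_conclusion)

def meta_review(reviews: list[Review]) -> str:
    """Stand-in for LLM-as-Meta-Reviewer. The toy rule mirrors the
    conclusion-driven bias the paper identifies: a single 'strong
    reject' flips the decision regardless of review content."""
    if any(r.conclusion == "strong reject" for r in reviews):
        return "reject"
    return "accept"

# Causal intervention: compare decisions before and after perturbing
# one review's conclusion while holding its text fixed.
original = [
    Review("Solid method, minor clarity issues.", "accept"),
    Review("Experiments are convincing.", "accept"),
]
perturbed = [perturb_conclusion(original[0], "strong reject"), original[1]]

print(meta_review(original), "->", meta_review(perturbed))  # accept -> reject
```

A robust meta-reviewer should weigh the unchanged arguments rather than the flipped label; repeating this intervention across aspects and measuring decision shifts is the diagnostic the framework proposes.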
Problem

Research questions and friction points this paper is trying to address.

Evaluate robustness of LLMs in peer review
Identify biases from aspect-based perturbations
Diagnose vulnerabilities in automated reviewing systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aspect-guided perturbation analysis
Multi-level framework evaluation
Targeted vulnerability diagnosis methodology
Jiatao Li
Peking University
Natural Language Processing
Yanheng Li
City University of Hong Kong
Human-Computer Interaction, Human-Robot Interaction
Xinyu Hu
Wangxuan Institute of Computer Technology, Peking University
Mingqi Gao
Wangxuan Institute of Computer Technology, Peking University
Xiaojun Wan
Peking University
Natural Language Processing, Text Mining, Artificial Intelligence