🤖 AI Summary
Prior work lacks a well-defined, data-driven approach to identifying fine-grained review aspects (dimensions) in peer review. Method: The paper proposes the first fully text-based, data-driven framework for discovering review aspects directly from real review texts rather than from predefined review forms and guidelines. The bottom-up pipeline combines language modeling, unsupervised clustering, topic modeling, and semantic similarity analysis, with human-in-the-loop annotation for aspect induction and validation. Contributions/Results: (1) the first high-quality, reusable dataset of peer reviews with expert-annotated aspects; (2) an operational, empirically grounded definition of "review aspect"; (3) empirical evidence that the choice of aspects critically affects downstream tasks such as detecting LLM-generated reviews, with consistent results across conferences, alignment with community values, and significant gains in detection accuracy.
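As a rough illustration of the bottom-up paradigm described above, the minimal sketch below embeds review sentences, clusters them, and surfaces per-cluster keywords as candidate aspects for human annotators to name. The embedding model, cluster count, keyword heuristic, and example sentences are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of bottom-up aspect discovery: embed review sentences,
# cluster them, and summarize each cluster with keywords as candidate aspects.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

review_sentences = [
    "The idea is novel but the experiments are limited to one dataset.",
    "Writing is clear, though Section 3 needs more detail.",
    "Baselines are missing, so the empirical claims are hard to assess.",
    "The contribution over prior work is incremental.",
]

# 1) Represent each review sentence with a sentence embedding (assumed model).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(review_sentences, normalize_embeddings=True)

# 2) Group semantically similar sentences into candidate aspect clusters.
n_clusters = 2  # in practice tuned and validated with human annotators
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)

# 3) Surface each cluster's highest-weighted TF-IDF terms as a starting point
#    for human-in-the-loop aspect naming.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(review_sentences)
terms = np.array(vectorizer.get_feature_names_out())
for c in range(n_clusters):
    rows = np.flatnonzero(labels == c)
    cluster_scores = np.asarray(tfidf[rows].mean(axis=0)).ravel()
    top_terms = terms[cluster_scores.argsort()[::-1][:5]]
    print(f"candidate aspect {c}: {', '.join(top_terms)}")
```

In the paper's framework, the clustering and topic-modeling output feeds an annotation step that consolidates clusters into an operational aspect schema; the sketch only covers the unsupervised part.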
📝 Abstract
Peer review is central to academic publishing, but the growing volume of submissions is straining the process. This motivates the development of computational approaches to support peer review. While each review is tailored to a specific paper, reviewers often make assessments according to certain aspects such as Novelty, which reflect the values of the research community. This alignment creates opportunities for standardizing the reviewing process, improving quality control, and enabling computational support. While prior work has demonstrated the potential of aspect analysis for peer review assistance, the notion of aspect remains poorly formalized. Existing approaches often derive aspect sets from review forms and guidelines of major NLP venues, yet data-driven methods for aspect identification are largely underexplored. To address this gap, our work takes a bottom-up approach: we propose an operational definition of aspect and develop a data-driven schema for deriving fine-grained aspects from a corpus of peer reviews. We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis. We further show how the choice of aspects can impact downstream applications, such as LLM-generated review detection. Our results lay a foundation for a principled and data-driven investigation of review aspects, and pave the way for new applications of NLP to support peer review.
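To make the downstream claim concrete, here is a purely hypothetical sketch of how aspect annotations could feed an LLM-generated-review detector: each review is mapped to a small vector of per-aspect coverage scores and passed to a linear classifier. The aspect list, cue-word matcher, and toy labels below are stand-ins invented for illustration and do not reproduce the paper's detection setup.

```python
# Illustrative only: aspect-coverage features for LLM-generated-review detection.
from sklearn.linear_model import LogisticRegression
import numpy as np

# Hypothetical aspect schema with cue words (not the paper's schema).
ASPECTS = {
    "novelty": ["novel", "original", "incremental"],
    "soundness": ["baseline", "experiment", "evidence"],
    "clarity": ["clear", "writing", "presentation"],
}

def aspect_features(review: str) -> np.ndarray:
    """Fraction of each aspect's cue words appearing (as substrings) in the review."""
    text = review.lower()
    return np.array([
        sum(word in text for word in words) / len(words)
        for words in ASPECTS.values()
    ])

# Toy labeled data: 1 = LLM-generated, 0 = human-written (made up for the sketch).
reviews = [
    "The paper is novel and the writing is clear, but baselines are missing.",
    "This work presents a clear, original, and well-motivated contribution.",
    "Experiments lack evidence; the contribution feels incremental.",
    "The presentation is clear and the experiments provide strong evidence.",
]
labels = np.array([0, 1, 0, 1])

X = np.stack([aspect_features(r) for r in reviews])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```

The point of the sketch is only that the feature space, and hence detector behavior, depends directly on which aspects are chosen, which is the sensitivity the paper reports.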