Metrics to Detect Small-Scale and Large-Scale Citation Orchestration

📅 2024-06-27
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Deliberate citation manipulation can distort the citation metrics used in academic evaluation, the h-index in particular. This paper proposes quantifiable indicators for detecting what the authors term "orchestration" at two scales: small-scale (an author, alone or with a small number of other authors, citing strategically) and large-scale (extensive collaboration among many co-authors that raises the h-index of many or all of them). Three indicators are introduced: an extremely low ratio of citations to the square of the h-index, an extremely small number of authors who account for at least 50% of an author's citations, and an extremely large number of co-authors with more than 50 co-authored papers. Using Scopus data across all scientific disciplines, the study examines the distribution of each indicator and explores potential thresholds based on the 1% and 5% percentiles.

📝 Abstract
Citation counts and related metrics have pervasive uses and misuses in academia and research appraisal, serving as scholarly influence and recognition measures. Hence, comprehending the citation patterns exhibited by authors is essential for assessing their research impact and contributions within their respective fields. Although the h-index, introduced by Hirsch in 2005, has emerged as a popular bibliometric indicator, it fails to account for the intricate relationships between authors and their citation patterns. This limitation becomes particularly relevant in cases where citations are strategically employed to boost the perceived influence of certain individuals or groups, a phenomenon that we term "orchestration". Orchestrated citations can introduce biases in citation rankings and therefore necessitate the identification of such patterns. Here, we use Scopus data to investigate orchestration of citations across all scientific disciplines. Orchestration could be small-scale, when the author and/or a small number of other authors use citations strategically to boost citation metrics like the h-index; or large-scale, where extensive collaborations among many co-authors lead to a high h-index for many or all of them. We propose three orchestration indicators: extremely low values in the ratio of citations over the square of the h-index (indicative of small-scale orchestration); an extremely small number of authors who can explain at least 50% of an author's total citations (indicative of either small-scale or large-scale orchestration); and an extremely large number of co-authors with more than 50 co-authored papers (indicative of large-scale orchestration). The distributions, potential thresholds based on 1% (and 5%) percentiles, and insights from these indicators are explored and put into perspective across science.
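
The three indicators named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the input shapes (a list of per-paper citation counts, one set of citing-author IDs per received citation, and a mapping from co-author to number of joint papers) and the greedy selection used for the second indicator are all assumptions made for illustration. In particular, the abstract does not say how the authors who "explain" 50% of citations are chosen, so the greedy cover below is only one possible reading.

```python
from collections import Counter


def h_index(citations_per_paper):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations_per_paper, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def citations_over_h_squared(citations_per_paper):
    """Indicator 1: total citations divided by h^2 (None when h is 0)."""
    h = h_index(citations_per_paper)
    if h == 0:
        return None
    return sum(citations_per_paper) / (h * h)


def min_authors_explaining_share(citing_author_sets, share=0.5):
    """Indicator 2 (assumed greedy reading): number of citing authors needed
    to cover at least `share` of the received citations.

    `citing_author_sets` holds one set of author IDs per received citation.
    """
    total = len(citing_author_sets)
    if total == 0:
        return 0
    uncovered = [set(authors) for authors in citing_author_sets]
    covered = 0
    chosen = 0
    while covered < share * total:
        counts = Counter(a for authors in uncovered for a in authors)
        if not counts:  # remaining citations carry no author information
            break
        best, gain = counts.most_common(1)[0]
        uncovered = [authors for authors in uncovered if best not in authors]
        covered += gain
        chosen += 1
    return chosen


def count_heavy_coauthors(joint_papers_by_coauthor, min_papers=50):
    """Indicator 3: co-authors with MORE than `min_papers` joint papers."""
    return sum(1 for n in joint_papers_by_coauthor.values() if n > min_papers)
```

Under these assumptions, a low value of the first two functions, or a high value of the third, would be the kind of signal the abstract describes. Whether a value counts as "extreme" depends on the population-level distribution, not on the single author's numbers.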
Problem

Research questions and friction points this paper is trying to address.

Detect small-scale citation orchestration by individual authors
Identify large-scale orchestration through extensive co-author collaborations
Develop metrics to uncover biases in citation rankings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Scopus data for citation analysis
Proposes three orchestration detection indicators
Analyzes small and large-scale citation patterns
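
The percentile-based thresholds mentioned in the abstract (1% and 5%) can be sketched as follows. This is a hedged illustration only: the nearest-rank percentile rule, the function names, and the convention of flagging the low tail for the first two indicators and the high tail for the co-author indicator are assumptions, not details confirmed by the source.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    k = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[k - 1]


def flag_extremes(value_by_author, pct=1.0, tail="low"):
    """Return the authors whose indicator value lies in the extreme `pct`% tail.

    tail="low"  flags unusually small values (assumed for indicators 1 and 2).
    tail="high" flags unusually large values (assumed for indicator 3).
    """
    values = list(value_by_author.values())
    if tail == "low":
        cut = nearest_rank_percentile(values, pct)
        return {a for a, v in value_by_author.items() if v <= cut}
    if tail == "high":
        # Mirror the low-tail rule by negating the values.
        cut = -nearest_rank_percentile([-v for v in values], pct)
        return {a for a, v in value_by_author.items() if v >= cut}
    raise ValueError("tail must be 'low' or 'high'")
```

For example, `flag_extremes(ratios, pct=5.0, tail="low")` would return the authors in the lowest 5% of a hypothetical `ratios` mapping. Ties at the cut-off are all flagged, so the flagged set can exceed the nominal percentage.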
Iakovos Evdaimon
LIX, École Polytechnique, Institut Polytechnique de Paris, Rte de Saclay, Palaiseau, 91120, France
John P. A. Ioannidis
Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA 94305
Giannis Nikolentzos
Assistant Professor, University of Peloponnese
Graph Machine Learning, Artificial Intelligence, Machine Learning, Graph Mining
Michail Chatzianastasis
Natera
Machine Learning, Graph Representation Learning, Graph Neural Networks
G. Panagopoulos
Department of Computer Science, University of Luxembourg, Maison du Nombre 6, avenue de la Fonte L-4364 Esch-sur-Alzette, Luxembourg
M. Vazirgiannis
LIX, École Polytechnique, Institut Polytechnique de Paris, Rte de Saclay, Palaiseau, 91120, France; Mohamed bin Zayed University of Artificial Intelligence, Masdar City, Abu Dhabi, United Arab Emirates