🤖 AI Summary
This study addresses the problem of systematic metadata discrepancies between OpenAlex and Web of Science (WoS) and their potential impact on bibliometric analyses. We compare consistency across four critical metadata dimensions—document type, publication year, language, and author count—using cross-database citation matching, rigorous data cleaning, and multidimensional quantitative consistency assessment. This method enables a comprehensive, empirical evaluation of metadata quality differences between these two major scholarly databases. Key findings reveal distinct error patterns: OpenAlex tends to overestimate author counts and misclassify document types, whereas WoS underrepresents non-English publications. Publication-year misalignment and language mislabeling further compound inter-database inconsistencies. These results provide empirical evidence and methodological guidance for database selection, interpretation of bibliometric indicators, and metadata curation in research evaluation and science policy.
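The comparison described above can be sketched in a few lines: match records from the two databases on a shared identifier, then compute per-field agreement rates. This is a minimal illustration, not the study's actual pipeline; the field names, DOI keys, and sample records are assumptions chosen for the example.

```python
# Hypothetical sketch of a cross-database consistency check:
# pair records by DOI, then measure per-field agreement for the
# four dimensions examined in the study.
FIELDS = ["doc_type", "year", "language", "n_authors"]

def match_by_doi(a, b):
    """Pair records sharing a DOI; each input maps DOI -> record dict."""
    return [(a[doi], b[doi]) for doi in a.keys() & b.keys()]

def agreement_rates(pairs):
    """Fraction of matched pairs whose values agree, per field."""
    rates = {}
    for f in FIELDS:
        same = sum(1 for x, y in pairs if x[f] == y[f])
        rates[f] = same / len(pairs) if pairs else 0.0
    return rates

# Illustrative records only, not real data from either database.
openalex = {
    "10.1/a": {"doc_type": "article", "year": 2020, "language": "en", "n_authors": 5},
    "10.1/b": {"doc_type": "review",  "year": 2021, "language": "en", "n_authors": 3},
}
wos = {
    "10.1/a": {"doc_type": "article", "year": 2020, "language": "en", "n_authors": 4},
    "10.1/b": {"doc_type": "review",  "year": 2021, "language": "en", "n_authors": 3},
}

pairs = match_by_doi(openalex, wos)
print(agreement_rates(pairs))  # n_authors disagrees for one of two pairs
```

In practice the study's matching and cleaning steps are far more involved (e.g., normalizing document-type vocabularies before comparison), but the per-dimension agreement rate is the core quantity being reported.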
📝 Abstract
Bibliometrics, whether used for research or research evaluation, relies on large multidisciplinary databases of research outputs and citation indices. The Web of Science (WoS) was the field's main supporting infrastructure for more than 30 years, until several new competitors emerged. OpenAlex, a bibliographic database launched in 2022, has distinguished itself through its openness and extensive coverage. While OpenAlex may reduce or eliminate barriers to accessing bibliometric data, one concern hindering its broader adoption for research and research evaluation is the quality of its metadata. This study assesses metadata quality in OpenAlex and WoS, focusing on document type, publication year, language, and number of authors. By documenting discrepancies and misattributions in metadata, this research seeks to raise awareness of data quality issues that could affect bibliometric research and evaluation outcomes.