Cross-Source Reasoning-based Correction for Author Name Disambiguation

📅 2026-06-07
🤖 AI Summary
Author name disambiguation in academic search is hindered by cross-source inconsistencies and error propagation, and manual annotation is costly. This work proposes CrossND, a framework that treats cross-source inconsistency as a corrective signal for automated disambiguation. CrossND combines a chain-of-refinement pipeline for denoising author profiles, supervised fine-tuning with a probabilistic soft logic-based cross-correction module, and test-time scaling, without relying on expert annotation. On real-world datasets, the method consistently outperforms 17 baselines, indicating that cross-source reasoning improves both accuracy and robustness in author name disambiguation.
📝 Abstract
Author name disambiguation is a critical challenge in academic search systems and is typically addressed with from-scratch or real-time disambiguation approaches. However, current algorithms remain vulnerable to cumulative errors in paper-author assignments and overlook assignments that are inconsistent across sources, while expert annotation is resource-intensive. To this end, this paper explores a new perspective for author name disambiguation: cross-source correction that leverages inconsistent assignments across sources. We propose CrossND, a full-stack framework that integrates data refinement, cross-source reasoning, and test-time scaling. First, a chain-of-refinement pipeline denoises author profiles and produces more accurate paper-author matching probabilities. Second, a supervised fine-tuning process incorporates these refined signals and a probabilistic soft logic-based cross-correction module to infer which sources' assignments are incorrect. Third, test-time scaling further enhances the accuracy and robustness of the predictions. Experiments on real-world datasets indicate that CrossND consistently outperforms 17 baselines by leveraging cross-source reasoning without human intervention.
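To make the probabilistic soft logic idea more concrete, the following is a minimal sketch of how a cross-source inconsistency could be scored with the standard Łukasiewicz relaxation used in probabilistic soft logic. It is not the paper's implementation: the rule, the function names, and the inputs (`p_a`, `p_b`, `same_author`, `err_a`, `err_b`) are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; the rule and all names are assumptions,
# not the cross-correction module described in the abstract.

def soft_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: soft conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def soft_or(a: float, b: float) -> float:
    """Lukasiewicz t-conorm: soft disjunction of two truth values in [0, 1]."""
    return min(1.0, a + b)


def soft_not(a: float) -> float:
    """Soft negation."""
    return 1.0 - a


def distance_to_satisfaction(body: float, head: float) -> float:
    """Hinge penalty for a soft rule `body -> head`; 0 means satisfied."""
    return max(0.0, body - head)


def conflict_penalty(p_a: float, p_b: float, same_author: float,
                     err_a: float, err_b: float) -> float:
    """Score one hypothetical rule:

        SameAuthor & AssignedInA & !AssignedInB -> ErrorInA | ErrorInB

    p_a, p_b    : paper-author matching probabilities from sources A and B
    same_author : confidence that both profiles denote the same person
    err_a, err_b: current belief that each source's assignment is wrong
    """
    body = soft_and(soft_and(same_author, p_a), soft_not(p_b))
    head = soft_or(err_a, err_b)
    return distance_to_satisfaction(body, head)


if __name__ == "__main__":
    # Sources disagree and neither is flagged as wrong: the rule is violated.
    print(conflict_penalty(0.9, 0.1, 1.0, err_a=0.0, err_b=0.0))
    # Flagging source A as erroneous explains the disagreement.
    print(conflict_penalty(0.9, 0.1, 1.0, err_a=1.0, err_b=0.0))
```

In a full probabilistic soft logic model, penalties like this would be weighted and summed over many rules, and the error variables would be inferred by minimizing the total. The sketch only evaluates a single rule for fixed values.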
Problem

Research questions and friction points this paper is trying to address.

author name disambiguation
cross-source reasoning
paper-author assignment
data inconsistency
academic search
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-source reasoning
author name disambiguation
probabilistic soft logic
test-time scaling
data refinement