🤖 AI Summary
Author name disambiguation in academic search is often hindered by inconsistent paper-author assignments across sources and by error propagation, while manual annotation is costly. This work proposes CrossND, a framework that uses cross-source inconsistency as a corrective signal to disambiguate authors without human intervention. CrossND combines a chain-of-refinement data-cleaning pipeline, supervised fine-tuning with a probabilistic soft logic-based cross-correction module, and test-time scaling. In experiments on real-world datasets, it consistently outperforms 17 baselines, supporting the value of cross-source reasoning for accuracy and robustness in author name disambiguation.
📝 Abstract
Author name disambiguation is a critical challenge in academic search systems, often addressed through from-scratch and real-time disambiguation approaches. However, current algorithms remain vulnerable to cumulative errors in paper-author assignments and overlook inconsistent assignments across different sources, and resorting to expert annotation is resource-intensive. To this end, this paper explores a new perspective for author name disambiguation: cross-source correction by leveraging inconsistent assignments across sources. We propose CrossND, a full-stack framework that integrates data refinement, cross-source reasoning, and test-time scaling. First, a chain-of-refinement pipeline denoises author profiles and produces more accurate paper-author matching probabilities. Second, a supervised fine-tuning process incorporates these refined signals and a probabilistic soft logic-based cross-correction module to infer which sources' assignments are incorrect. Third, test-time scaling further enhances the accuracy and robustness of the predictions. Experiments on real-world datasets indicate that CrossND consistently outperforms 17 baselines by leveraging cross-source reasoning without human intervention.
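To make the cross-correction idea concrete, the following is a minimal sketch of how a probabilistic soft logic (PSL) style rule could flag the less reliable of two disagreeing sources. It is not CrossND's actual module: the rule `Disagree AND NOT Match -> Wrong`, the function names, and the example probabilities are all assumptions made for illustration. Only the Lukasiewicz operators and the distance-to-satisfaction definition are standard PSL.

```python
# Illustrative sketch only: a single hypothetical PSL-style rule,
#   Disagree(paper) AND NOT Match(source, paper) -> Wrong(source, paper)
# evaluated with Lukasiewicz soft-logic operators. Not the paper's implementation.

def luk_and(a: float, b: float) -> float:
    """Lukasiewicz conjunction over truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)

def luk_not(a: float) -> float:
    """Lukasiewicz negation."""
    return 1.0 - a

def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the rule body -> head is from being satisfied (0 means satisfied)."""
    return max(0.0, body - head)

def min_wrong_score(disagree: float, match_prob: float) -> float:
    """Smallest truth value of Wrong(...) that fully satisfies the hypothetical rule."""
    return luk_and(disagree, luk_not(match_prob))

def flag_incorrect_source(match_probs: dict[str, float], disagree: float = 1.0) -> str:
    """Return the source whose assignment the rule pushes hardest toward 'wrong'."""
    scores = {src: min_wrong_score(disagree, p) for src, p in match_probs.items()}
    return max(scores, key=scores.get)

# Hypothetical matching probabilities for one paper assigned differently by two sources.
example = {"source_a": 0.9, "source_b": 0.2}
suspect = flag_incorrect_source(example)
```

In this toy setting, the source with the lower paper-author matching probability receives the higher "wrong" score. A full PSL model would instead combine many weighted rules and minimize the total weighted distance to satisfaction jointly over all assignments.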