Detection and Interpretability Analysis of Quotation Errors by Large Language Models

📅 2026-06-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Quotation errors, in which cited information is inconsistent with its original source, are pervasive in academic writing, and manual verification is labor-intensive and inefficient. This work applies fine-tuned large language models to quotation error detection and introduces a dataset that incorporates full-text information from the cited references. The study compares three strategies for integrating full-text content and finds that using the abstract of the source document yields the best detection performance. The authors also use TokenSHAP for interpretability analysis to examine which parts of the input drive the model's predictions. The approach aims to improve the reliability of citation validation and to support accuracy and fairness in citation-based scholarly evaluation.

📝 Abstract

Purpose - A quotation error is an inconsistency between cited information and its original source. Such errors lead to misinterpretation of the original research, undermine the academic community's collective understanding of the issues involved, and weaken the accuracy and fairness of citation-based academic evaluation. Existing studies show that quotation errors are prevalent in the academic community, and manual verification is both labor-intensive and inefficient. This paper therefore proposes the task of automated detection of quotation errors.

Methodology - Adopting a large language model (LLM)-based approach, this paper builds on existing research to improve detection performance in two ways: first, fine-tuning LLMs to detect quotation errors; second, incorporating full-text data from the cited literature into dataset construction and comparing three full-text integration methods to identify the best scheme for building such datasets. The paper then uses the TokenSHAP tool to conduct an interpretability analysis of the model's predictions.

Findings - Fine-tuning LLMs improved performance in detecting quotation errors. Among the methods for incorporating full-text information, using the source abstract yielded the best performance.

Originality - Fine-tuning of LLMs is applied to the task of automated detection of quotation errors, and an interpretability analysis is conducted on the model's outputs.
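
The token-level attribution idea behind the interpretability analysis can be illustrated with a small sketch. It is not the TokenSHAP tool or its API, and it is not the authors' pipeline: it computes exact Shapley values over a four-token sentence by enumerating every subset, whereas practical tools estimate them by sampling. The scoring function `toy_error_score`, the example sentence, and the weights are all hypothetical stand-ins for a fine-tuned model's predicted probability of a quotation error.

```python
from itertools import combinations
from math import factorial


def exact_token_shapley(tokens, value_fn):
    """Return the exact Shapley value of each token position.

    value_fn takes a list of kept tokens (in original order) and
    returns a score. Cost grows as 2**len(tokens), so this is only
    practical for very short inputs.
    """
    n = len(tokens)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude token i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = [tokens[j] for j in subset]
                with_i = [tokens[j] for j in sorted(subset + (i,))]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


def toy_error_score(kept_tokens):
    """Hypothetical stand-in for a model's quotation-error probability."""
    score = 0.0
    if "always" in kept_tokens:  # overgeneralizing word (assumed weight)
        score += 0.6
    if "cures" in kept_tokens:   # overstated claim (assumed weight)
        score += 0.3
    return score


tokens = ["drug", "always", "cures", "patients"]
attributions = exact_token_shapley(tokens, toy_error_score)
for token, value in zip(tokens, attributions):
    print(f"{token:>10s} {value:+.3f}")
```

Because the toy scorer is additive, each token's attribution equals its own assumed weight, and the attributions sum to the difference between the score of the full sentence and the score of the empty input. A real model's score is not additive, which is where Shapley values become informative about token interactions.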

Problem

Research questions and friction points this paper is trying to address.

quotation error

automated detection

citation inconsistency

academic integrity

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

large language models

quotation error detection

fine-tuning