CiteFix: Enhancing RAG Accuracy Through Post-Processing Citation Correction

📅 2025-04-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
RAG systems suffer from low citation accuracy (≈74% with existing methods). This paper proposes a lightweight post-hoc citation correction framework built on a three-tier verification mechanism: keyword matching, fine-tuned BERTScore-based semantic scoring, and lightweight LLM-based re-ranking validation. The method integrates into existing RAG pipelines without modifying the core architecture and adds negligible latency or computational overhead. Experiments demonstrate a 15.46% relative improvement in citation accuracy. Notably, the framework remains effective under model downgrading, achieving comparable performance with an approximately 12× reduction in cost and 3× faster inference. The authors position this as the first work to combine multi-granularity semantic matching with lightweight LLM validation for citation post-processing, supporting efficient and high-fidelity RAG.
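As a rough illustration of the first tier (keyword matching), the sketch below cross-checks a generated citation against the retrieved passages and re-attributes it when lexical overlap is low. All function names, the stop-word list, and the threshold are illustrative assumptions, not the paper's actual implementation; the semantic (BERTScore) and LLM-based tiers are omitted.

```python
# Hypothetical sketch of the keyword-matching tier of post-hoc
# citation correction. Names and thresholds are illustrative only.

def keyword_overlap(sentence: str, passage: str) -> float:
    """Fraction of the sentence's content words that appear in the passage."""
    stop = {"the", "a", "an", "of", "in", "to", "and", "is", "are", "was"}
    words = {w.lower().strip(".,") for w in sentence.split()} - stop
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in passage.lower())
    return hits / len(words)

def correct_citation(sentence: str, cited_id: str,
                     passages: dict[str, str], threshold: float = 0.5):
    """If the cited passage scores below threshold, re-attribute the
    sentence to the best-matching retrieved passage; return None if no
    passage matches well enough (i.e. drop the citation)."""
    scores = {pid: keyword_overlap(sentence, text)
              for pid, text in passages.items()}
    if scores.get(cited_id, 0.0) >= threshold:
        return cited_id  # original citation passes the keyword check
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None
```

In a full pipeline, sentences that fail this cheap check would be escalated to the semantic-scoring tier rather than corrected outright, which is how the overall method keeps latency low.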

📝 Abstract
Retrieval Augmented Generation (RAG) has emerged as a powerful application of Large Language Models (LLMs), revolutionizing information search and consumption. RAG systems combine traditional search capabilities with LLMs to generate comprehensive answers to user queries, ideally with accurate citations. However, in our experience of developing a RAG product, LLMs often struggle with source attribution, aligning with other industry studies reporting citation accuracy rates of only about 74% for popular generative search engines. To address this, we present efficient post-processing algorithms to improve citation accuracy in LLM-generated responses, with minimal impact on latency and cost. Our approaches cross-check generated citations against retrieved articles using methods including keyword + semantic matching, a fine-tuned model with BERTScore, and a lightweight LLM-based technique. Our experimental results demonstrate a relative improvement of 15.46% in the overall accuracy metrics of our RAG system. This significant enhancement potentially enables a shift from our current larger language model to a relatively smaller model that is approximately 12x more cost-effective and 3x faster in inference time, while maintaining comparable performance. This research contributes to enhancing the reliability and trustworthiness of AI-generated content in information retrieval and summarization tasks, which is critical to gaining customer trust, especially in commercial products.
Problem

Research questions and friction points this paper is trying to address.

Improving citation accuracy in RAG systems
Reducing latency and cost in LLM-generated responses
Enhancing reliability of AI-generated content
Innovation

Methods, ideas, or system contributions that make the work stand out.

Post-processing algorithms for citation correction
Keyword and semantic matching techniques
Lightweight LLM-based cross-checking method