RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering

📅 2025-02-19
🤖 AI Summary
Existing medical RAG systems over-rely on conceptual knowledge while neglecting structured factual data (e.g., EHRs), resulting in imprecise retrieval and limited clinical applicability. To address this, we propose RGAR—a fact-driven, dual-source collaborative retrieval framework—that pioneers the integration of structured EHR facts with conceptual knowledge from biomedical literature, establishing a fact–concept mutual enhancement loop. RGAR further introduces a generative feedback loop to iteratively refine retrieval and answer generation. Technically, it incorporates EHR fact extraction, cross-source semantic alignment, generation-augmented retrieval, and fine-tuning of Llama-3.1-8B-Instruct. Evaluated on three fact-aware medical QA benchmarks, RGAR achieves state-of-the-art performance, significantly outperforming RAG-enhanced GPT-3.5 in both answer accuracy and clinical credibility.
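The recurrence described above — extract EHR facts, retrieve conceptual knowledge, then let each draft answer refine the next round's fact extraction — can be sketched as a simple loop. This is a minimal illustrative sketch, not the paper's actual implementation: every function name (`extract`, `retrieve`, `generate`) and the toy stubs below are assumptions for demonstration.

```python
# Hypothetical sketch of a fact-concept mutual enhancement loop with
# generative feedback, as described in the summary. All names are
# illustrative assumptions, not the authors' code.

def rgar_answer(question, ehr, retrieve, generate, extract, rounds=2):
    """Alternate factual (EHR) and conceptual (corpus) retrieval,
    letting each round's draft answer refine the next fact extraction."""
    facts = extract(ehr, question)            # factual knowledge from the EHR
    answer = ""
    for _ in range(rounds):
        # Conceptual retrieval guided by the current facts and question
        query = generate(f"retrieval query for: {question} given {facts}")
        concepts = retrieve(query)            # conceptual knowledge from corpus
        # Draft an answer, then feed it back to refine fact extraction
        answer = generate(f"answer {question} using {facts} and {concepts}")
        facts = extract(ehr, question + " " + answer)
    return answer

# Toy stubs that only exercise the control flow
ehr = ["hemoglobin 9.1 g/dL", "age 62", "on metformin"]
extract = lambda ehr, q: ehr[:2]
retrieve = lambda q: ["anemia guideline snippet"]
generate = lambda prompt: f"[gen:{len(prompt)} chars]"

print(rgar_answer("Why is the patient fatigued?", ehr, retrieve, generate, extract))
```

The key design point the summary emphasizes is the feedback edge: the draft answer flows back into fact extraction, so retrieval quality can improve across rounds rather than being fixed by the initial query.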

📝 Abstract
Medical question answering requires extensive access to specialized conceptual knowledge. The current paradigm, Retrieval-Augmented Generation (RAG), acquires specialized medical knowledge through large-scale corpus retrieval and uses this knowledge to guide a general-purpose large language model (LLM) for generating answers. However, existing retrieval approaches often overlook the importance of factual knowledge, which limits the relevance of retrieved conceptual knowledge and restricts its applicability in real-world scenarios, such as clinical decision-making based on Electronic Health Records (EHRs). This paper introduces RGAR, a recurrence generation-augmented retrieval framework that retrieves both relevant factual and conceptual knowledge from dual sources (i.e., EHRs and the corpus), allowing them to interact and refine each other. Through extensive evaluation across three factual-aware medical question answering benchmarks, RGAR establishes a new state-of-the-art performance among medical RAG systems. Notably, the Llama-3.1-8B-Instruct model with RGAR surpasses the considerably larger, RAG-enhanced GPT-3.5. Our findings demonstrate the benefit of extracting factual knowledge for retrieval, which consistently yields improved generation quality.
Problem

Research questions and friction points this paper is trying to address.

Improving the accuracy of medical question answering
Integrating factual and conceptual knowledge retrieval
Supporting clinical decision-making grounded in EHRs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines dual-source retrieval from EHRs and a biomedical corpus
Lets factual and conceptual knowledge interact and refine each other
Improves answer generation quality via a generative feedback loop
Sichu Liang
School of Computer Science and Engineering, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Southeast University, Ministry of Education, China
Linhai Zhang
King's College London
Large Language Model · LLM Personalization · LLM Uncertainty
Hongyu Zhu
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, China
Wenwen Wang
Assistant Professor, School of Computing, University of Georgia
Computer Systems
Yulan He
Professor, King's College London; Turing AI Fellow
Natural Language Processing · Large Language Models · AI for Education and Health
Deyu Zhou
Professor, School of Computer Science and Engineering, SEU
Natural Language Processing