Domain-Specific Knowledge Graphs in RAG-Enhanced Healthcare LLMs

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the insufficient factual precision of general-purpose retrieval-augmented generation (RAG) systems in the medical domain by proposing a knowledge graph–based RAG (KG-RAG) strategy guided by the principle of "precision-first, scope-matching." Leveraging PubMed, the authors construct three disease-specific knowledge graphs and systematically evaluate the impact of retrieval sources on large language model (LLM) output quality in subdomains such as type 2 diabetes mellitus (T2DM) and Alzheimer's disease, employing probe-based assessment, multi-temperature decoding, and instruction fine-tuning. The findings reveal that scope-matched knowledge graphs (e.g., G₂) substantially improve answer accuracy; that small-to-medium models rely heavily on precise retrieval, whereas large models achieve comparable performance even without RAG owing to strong parametric priors; and that high-temperature decoding yields negligible benefits. These results offer practical guidance for knowledge graph selection and model configuration in medical RAG systems.

📝 Abstract
Large Language Models (LLMs) generate fluent answers but can struggle with trustworthy, domain-specific reasoning. We evaluate whether domain knowledge graphs (KGs) improve Retrieval-Augmented Generation (RAG) for healthcare by constructing three PubMed-derived graphs: $\mathbb{G}_1$ (T2DM), $\mathbb{G}_2$ (Alzheimer's disease), and $\mathbb{G}_3$ (AD+T2DM). We design two probes: Probe 1 targets merged AD+T2DM knowledge, while Probe 2 targets the intersection of $\mathbb{G}_1$ and $\mathbb{G}_2$. Seven instruction-tuned LLMs are tested across retrieval sources {No-RAG, $\mathbb{G}_1$, $\mathbb{G}_2$, $\mathbb{G}_1$+$\mathbb{G}_2$, $\mathbb{G}_3$, $\mathbb{G}_1$+$\mathbb{G}_2$+$\mathbb{G}_3$} and three decoding temperatures. Results show that scope alignment between probe and KG is decisive: precise, scope-matched retrieval (notably $\mathbb{G}_2$) yields the most consistent gains, whereas indiscriminate graph unions often introduce distractors that reduce accuracy. Larger models frequently match or exceed KG-RAG with a No-RAG baseline on Probe 1, indicating strong parametric priors, whereas smaller/mid-sized models benefit more from well-scoped retrieval. Temperature plays a secondary role; higher values rarely help. We conclude that precision-first, scope-matched KG-RAG is preferable to breadth-first unions, and we outline practical guidelines for graph selection, model sizing, and retrieval/reranking. Code and data are available at https://github.com/sydneyanuyah/RAGComparison
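The retrieval-source comparison the abstract describes can be illustrated with a minimal sketch. Here each knowledge graph is modeled as a set of (subject, relation, object) triples, and a retrieval source is the union of one or more graphs; all names and toy facts below (`G1`, `G2`, `retrieve`) are illustrative assumptions, not taken from the paper's codebase.

```python
# Toy T2DM graph (illustrative triples, not real paper data)
G1 = {
    ("metformin", "treats", "T2DM"),
    ("insulin_resistance", "characterizes", "T2DM"),
}
# Toy Alzheimer's disease graph
G2 = {
    ("amyloid_beta", "accumulates_in", "AD"),
    ("donepezil", "treats", "AD"),
}

def retrieve(graphs, topic):
    """Return triples mentioning the topic from the union of the given graphs."""
    union = set().union(*graphs)
    return {t for t in union if topic in t}

# Scope-matched retrieval: an AD probe against G2 alone returns only AD facts.
matched = retrieve([G2], "AD")

# Breadth-first union: G1 + G2 doubles the candidate pool a reranker must
# filter, adding off-scope T2DM triples as potential distractors for an AD probe.
union_pool = set().union(G1, G2)

print(len(matched), len(union_pool))  # 2 4
```

The gap between `matched` and `union_pool` is a stand-in for the paper's finding that indiscriminate graph unions introduce distractors a scope-matched graph avoids.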
Problem

Research questions and friction points this paper is trying to address.

Domain-Specific Knowledge Graphs
Retrieval-Augmented Generation
Healthcare LLMs
Trustworthy Reasoning
Knowledge Scope Alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph
Retrieval-Augmented Generation
Domain-Specific Reasoning
Healthcare LLMs
Scope Alignment
Sydney Anuyah
Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN, USA
Mehedi Mahmud Kaushik
Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN, USA
Hao Dai
School of Medicine, Indiana University, Indianapolis, IN, USA
R. Shiradkar
Department of Biomedical Engineering and Informatics, Indiana University, Indianapolis, IN, USA
A. Durresi
Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN, USA
Sunandan Chakraborty
Indiana University IUPUI
Data Science · Text Mining · Computational Sustainability · ICTD