Domain-Specific Knowledge Graphs in RAG-Enhanced Healthcare LLMs

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the insufficient factual precision of general-purpose retrieval-augmented generation (RAG) systems in the medical domain by proposing a knowledge graph–based RAG (KG-RAG) strategy guided by the principle of "precision-first, scope-matching." Leveraging PubMed, the authors construct three disease-specific knowledge graphs and systematically evaluate the impact of retrieval sources on large language model (LLM) output quality in subdomains such as type 2 diabetes mellitus (T2DM) and Alzheimer's disease, employing probe-based assessment, multi-temperature decoding, and instruction fine-tuning. The findings reveal that scope-matched knowledge graphs (e.g., G₂) substantially improve answer accuracy; that small-to-medium models rely heavily on precise retrieval, whereas large models achieve comparable performance even without RAG owing to strong parametric priors; and that high-temperature decoding yields negligible benefits. These results offer practical guidance for knowledge graph selection and model configuration in medical RAG systems.

📝 Abstract
Large Language Models (LLMs) generate fluent answers but can struggle with trustworthy, domain-specific reasoning. We evaluate whether domain knowledge graphs (KGs) improve Retrieval-Augmented Generation (RAG) for healthcare by constructing three PubMed-derived graphs: $\mathbb{G}_1$ (T2DM), $\mathbb{G}_2$ (Alzheimer's disease), and $\mathbb{G}_3$ (AD+T2DM). We design two probes: Probe 1 targets merged AD+T2DM knowledge, while Probe 2 targets the intersection of $\mathbb{G}_1$ and $\mathbb{G}_2$. Seven instruction-tuned LLMs are tested across retrieval sources {No-RAG, $\mathbb{G}_1$, $\mathbb{G}_2$, $\mathbb{G}_1$+$\mathbb{G}_2$, $\mathbb{G}_3$, $\mathbb{G}_1$+$\mathbb{G}_2$+$\mathbb{G}_3$} and three decoding temperatures. Results show that scope alignment between probe and KG is decisive: precise, scope-matched retrieval (notably $\mathbb{G}_2$) yields the most consistent gains, whereas indiscriminate graph unions often introduce distractors that reduce accuracy. Larger models frequently match or exceed KG-RAG with a No-RAG baseline on Probe 1, indicating strong parametric priors, whereas smaller/mid-sized models benefit more from well-scoped retrieval. Temperature plays a secondary role; higher values rarely help. We conclude that precision-first, scope-matched KG-RAG is preferable to breadth-first unions, and we outline practical guidelines for graph selection, model sizing, and retrieval/reranking. Code and data are available at https://github.com/sydneyanuyah/RAGComparison
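The retrieval-source comparison the abstract describes can be illustrated with a minimal sketch. Here each knowledge graph is modeled as a set of (subject, relation, object) triples, and a retrieval source is the union of one or more graphs; all names and toy facts below (`G1`, `G2`, `retrieve`) are illustrative assumptions, not taken from the paper's codebase.

```python
# Toy T2DM graph (illustrative triples, not real paper data)
G1 = {
    ("metformin", "treats", "T2DM"),
    ("insulin_resistance", "characterizes", "T2DM"),
}
# Toy Alzheimer's disease graph
G2 = {
    ("amyloid_beta", "accumulates_in", "AD"),
    ("donepezil", "treats", "AD"),
}

def retrieve(graphs, topic):
    """Return triples mentioning the topic from the union of the given graphs."""
    union = set().union(*graphs)
    return {t for t in union if topic in t}

# Scope-matched retrieval: an AD probe against G2 alone returns only AD facts.
matched = retrieve([G2], "AD")

# Breadth-first union: G1 + G2 doubles the candidate pool a reranker must
# filter, adding off-scope T2DM triples as potential distractors for an AD probe.
union_pool = set().union(G1, G2)

print(len(matched), len(union_pool))  # 2 4
```

The gap between `matched` and `union_pool` is a stand-in for the paper's finding that indiscriminate graph unions introduce distractors a scope-matched graph avoids.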
Problem

Research questions and friction points this paper is trying to address.

Domain-Specific Knowledge Graphs
Retrieval-Augmented Generation
Healthcare LLMs
Trustworthy Reasoning
Knowledge Scope Alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph
Retrieval-Augmented Generation
Domain-Specific Reasoning
Healthcare LLMs
Scope Alignment
Sydney Anuyah
Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN, USA
Mehedi Mahmud Kaushik
Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN, USA
Hao Dai
School of Medicine, Indiana University, Indianapolis, IN, USA
R. Shiradkar
Department of Biomedical Engineering and Informatics, Indiana University, Indianapolis, IN, USA
A. Durresi
Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN, USA
Sunandan Chakraborty
Indiana University IUPUI
Data Science · Text Mining · Computational Sustainability · ICTD