Lung-R1: A Knowledge Graph-Guided LLM for Pulmonary Diagnostic Reasoning

📅 2026-06-10
🤖 AI Summary
This study addresses the “knowledge-to-diagnosis gap” in pulmonary disease diagnosis: the gap between general medical knowledge and patient-specific reasoning over electronic medical records (EMRs). To bridge it, the authors construct LungKG, which they describe as the first comprehensive lung diagnosis knowledge graph, covering 15 entity types and 112 relation types, and introduce a knowledge graph-guided training paradigm for large language models. Combining knowledge-constrained chain-of-thought reasoning with reinforcement learning, they develop Lung-R1-14B, a specialized large model for pulmonary diagnosis. The model scores 4.3583 on the EMR Diagnosis task, 0.1476 points above the strongest non-Lung-R1 baseline, and reports state-of-the-art performance on three benchmarks: Choice, Pulmonary-QA, and EMR Diagnosis.
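The two EMR Diagnosis figures quoted above imply the score of the strongest baseline, which the summary does not state directly. The short Python check below derives it; the variable names are illustrative, and because the scale of the score is not given here, the relative gain is only indicative.

```python
from decimal import Decimal

# Figures quoted in the summary above.
lung_r1_score = Decimal("4.3583")   # Lung-R1-14B on EMR Diagnosis
margin = Decimal("0.1476")          # lead over the strongest baseline

# Implied score of the strongest non-Lung-R1 baseline.
baseline_score = lung_r1_score - margin

# Relative gain over that baseline, as a percentage rounded to two decimals.
relative_gain_pct = (margin / baseline_score * 100).quantize(Decimal("0.01"))

print(baseline_score)      # 4.2107
print(relative_gain_pct)   # 3.51
```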
📝 Abstract
Diagnosing pulmonary diseases requires integrating heterogeneous evidence amid phenotypic variability and cross-disease overlap. Although large language models (LLMs) have shown progress on pulmonary knowledge question answering (QA) and information-processing tasks, reliable pulmonary diagnosis requires patient-specific, relation-aware reasoning over electronic medical record (EMR) evidence rather than isolated knowledge recall. We define this gap between pulmonary knowledge and case-level diagnostic reasoning as the Pulmonary Knowledge-to-Diagnosis Gap. To address it, we introduce LungKG, the first structured pulmonary knowledge graph for diagnostic knowledge organization and record-grounded reasoning. LungKG contains 59,038 nodes and 164,308 edges across 15 entity types and 112 relation types, serving as both a reusable pulmonary knowledge resource and the foundation for LungKG-guided model adaptation. Built on LungKG, we propose Lung-R1, a LungKG-guided pulmonary LLM trained through KG-constrained reasoning-chain construction and KG-guided reinforcement learning. In a 20-system evaluation, Lung-R1-14B achieves state-of-the-art performance across Choice, Pulmonary-QA, and EMR Diagnosis, reaching an EMR Diagnosis score of 4.3583 and surpassing the strongest non-Lung-R1 baseline by 0.1476 points. These results demonstrate the value of LungKG-guided training for EMR-based pulmonary diagnosis.
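To make the idea of KG-constrained reasoning more concrete, the following Python sketch shows one way a reasoning chain could be checked against a typed knowledge graph and turned into a scalar reward. It is an illustration under stated assumptions, not the authors' method: the abstract does not describe the actual chain-construction procedure or reward function, and the class names, entity types, relation names, and example triples here are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One typed edge: (head entity, relation type, tail entity)."""
    head: str
    relation: str
    tail: str


@dataclass
class ToyKG:
    """A tiny stand-in for a typed knowledge graph (hypothetical schema)."""
    entity_types: dict = field(default_factory=dict)  # entity -> entity type
    triples: set = field(default_factory=set)

    def add(self, head, head_type, relation, tail, tail_type):
        """Register both entities with their types and store the edge."""
        self.entity_types[head] = head_type
        self.entity_types[tail] = tail_type
        self.triples.add(Triple(head, relation, tail))

    def supports(self, step):
        """A reasoning step is supported if it matches an edge exactly."""
        return step in self.triples


def unsupported_steps(kg, chain):
    """Return the steps of a reasoning chain that the graph does not contain."""
    return [step for step in chain if not kg.supports(step)]


def kg_consistency_reward(kg, chain):
    """Assumed reward: fraction of chain steps backed by a graph edge."""
    if not chain:
        return 0.0
    return 1.0 - len(unsupported_steps(kg, chain)) / len(chain)


# Hypothetical toy graph with three entity types and two relation types.
kg = ToyKG()
kg.add("chronic cough", "Symptom", "symptom_of", "COPD", "Disease")
kg.add("COPD", "Disease", "diagnosed_by", "spirometry", "Examination")

# A candidate reasoning chain; the last step has no edge in the toy graph.
chain = [
    Triple("chronic cough", "symptom_of", "COPD"),
    Triple("COPD", "diagnosed_by", "spirometry"),
    Triple("fever", "symptom_of", "COPD"),
]

print(unsupported_steps(kg, chain))      # the single "fever" step
print(kg_consistency_reward(kg, chain))  # two of three steps are supported
```

Exact edge matching is the simplest possible constraint. A graph the size of LungKG (59,038 nodes, 164,308 edges) would more plausibly need entity linking from free-text EMR mentions to graph nodes before any such check, which this sketch leaves out.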
Problem

Research questions and friction points this paper is trying to address.

pulmonary diagnosis
knowledge-to-diagnosis gap
electronic medical record
diagnostic reasoning
phenotypic variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph
Large Language Model
Diagnostic Reasoning
Pulmonary Disease
Reinforcement Learning
🔎 Similar Papers
Haoyang Zeng
Xaira Therapeutics
Machine Learning, Protein Design, Peptide Vaccine, Gene Regulation
Yuanxi Fu
School of Computer Science, Chongqing University, Chongqing, China
Rongzhen Li
AI Research Institution, Mashang Financial Institution
Yuming Yang
Fudan University
Natural Language Processing, Large Language Models
Xiao Sun
School of Computer Science, Chongqing University, Chongqing, China
Jingwang Huang
School of Computer Science, Chongqing University, Chongqing, China
Gujie Shao
School of Computer Science, Chongqing University, Chongqing, China
Guohui Xiang
AI Research Institution, Mashang Financial Institution
Quan Lu
AI Research Institution, Mashang Financial Institution
Dongfan Ye
Department of Information, Third Military Medical University
Xuetao Chen
Department of Information, Third Military Medical University
Jiang Zhong
School of Computer Science, Chongqing University, Chongqing, China
Kaiwen Wei
School of Computer Science, Chongqing University, Chongqing, China
Zhi Xu
Department of Information, Third Military Medical University