Utilizing Large Language Models for Zero-Shot Medical Ontology Extension from Clinical Notes

📅 2025-11-20
🤖 AI Summary
To address insufficient coverage in medical ontologies, this work proposes CLOZE, a zero-shot, annotation-free, privacy-preserving framework that extends ontologies directly from clinical notes. CLOZE uses large language models (LLMs) to identify novel medical concepts in unstructured clinical notes and to place them within an existing hierarchy, while automated de-identification removes protected health information (PHI). No fine-tuning or labeled training data is required. The stated contributions are: (1) an LLM-based zero-shot approach to ontology extension; (2) extraction of disease-related concepts together with their hierarchical relationships; and (3) automated PHI removal to protect patient privacy. The authors report that CLOZE is accurate, scalable, and privacy-preserving, with potential to support downstream applications in biomedical research and clinical informatics.

📝 Abstract
Integrating novel medical concepts and relationships into existing ontologies can significantly enhance their coverage and utility for both biomedical research and clinical applications. Clinical notes, as unstructured documents rich with detailed patient observations, offer valuable context-specific insights and represent a promising yet underutilized source for ontology extension. Despite this potential, directly leveraging clinical notes for ontology extension remains largely unexplored. To address this gap, we propose CLOZE, a novel framework that uses large language models (LLMs) to automatically extract medical entities from clinical notes and integrate them into hierarchical medical ontologies. By capitalizing on the strong language understanding and extensive biomedical knowledge of pre-trained LLMs, CLOZE effectively identifies disease-related concepts and captures complex hierarchical relationships. The zero-shot framework requires no additional training or labeled data, making it a cost-efficient solution. Furthermore, CLOZE ensures patient privacy through automated removal of protected health information (PHI). Experimental results demonstrate that CLOZE provides an accurate, scalable, and privacy-preserving ontology extension framework, with strong potential to support a wide range of downstream applications in biomedical research and clinical informatics.
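
The abstract describes the workflow only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a hypothetical `propose` callable standing in for a zero-shot LLM call that returns `(concept, parent)` pairs for a note; the `Ontology` class, `extend_ontology`, and `stub_proposer` are illustrative names, not part of CLOZE.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Ontology:
    """A tiny is-a hierarchy: each concept maps to its parent (None for the root)."""
    parent: Dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, concept: str, parent: Optional[str]) -> bool:
        """Attach `concept` under `parent`; return False if nothing was added."""
        if concept in self.parent:
            return False  # already covered by the ontology
        if parent is not None and parent not in self.parent:
            return False  # proposed parent is unknown, so skip rather than guess
        self.parent[concept] = parent
        return True

    def path_to_root(self, concept: str) -> List[str]:
        """Return the chain of ancestors from `concept` up to the root."""
        path = [concept]
        while self.parent[path[-1]] is not None:
            path.append(self.parent[path[-1]])
        return path


# Hypothetical interface: (note text, known concepts) -> [(new concept, parent)]
Proposer = Callable[[str, List[str]], List[Tuple[str, str]]]


def extend_ontology(onto: Ontology, notes: List[str], propose: Proposer) -> List[str]:
    """Run the proposer over each note and insert accepted concepts."""
    added = []
    for note in notes:
        for concept, parent in propose(note, sorted(onto.parent)):
            if onto.add(concept, parent):
                added.append(concept)
    return added


def stub_proposer(note: str, known: List[str]) -> List[Tuple[str, str]]:
    """Stand-in for an LLM call (assumption): a fixed rule so the example runs offline."""
    proposals = []
    if "diabetic ketoacidosis" in note.lower() and "diabetes mellitus" in known:
        proposals.append(("diabetic ketoacidosis", "diabetes mellitus"))
    return proposals


onto = Ontology({"disease": None, "diabetes mellitus": "disease"})
added = extend_ontology(onto, ["Patient admitted with diabetic ketoacidosis."], stub_proposer)
```

In a real setting the stub would be replaced by a prompt to an LLM, and rejecting proposals whose parent is not already in the ontology is one simple guard against hallucinated hierarchy links.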

Problem

Research questions and friction points this paper is trying to address.

Automatically extending medical ontologies from clinical notes
Extracting disease concepts and hierarchical relationships using LLMs
Providing a zero-shot, privacy-preserving framework for ontology extension

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs extract medical entities from clinical notes
Zero-shot framework integrates concepts into ontologies
Automated PHI removal ensures patient privacy protection
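
The summary does not say how PHI is removed, so the snippet below is only a rough illustration of rule-based redaction, a swapped-in technique and not the paper's method. The pattern list and the `redact_phi` helper are assumptions for this example; real de-identification needs far broader coverage (names, addresses, and other identifiers).

```python
import re

# Illustrative patterns only (assumption): dates, US-style phone numbers, record numbers.
PHI_PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("MRN", re.compile(r"\bMRN[:#]?\s*\d+\b", re.IGNORECASE)),
]


def redact_phi(text: str) -> str:
    """Replace each matched span with a bracketed placeholder naming its type."""
    for label, pattern in PHI_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Seen on 03/14/2024, MRN: 123456, callback 404-555-0199. Dx: type 2 diabetes."
clean = redact_phi(note)
```

Typed placeholders such as `[DATE]` keep the sentence structure intact, so clinical content like the diagnosis remains available for concept extraction after redaction.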

Authors

Guanchen Wu
Department of Computer Science, Emory University
Yuzhang Xie
Department of Computer Science, Emory University
Huanwei Wu
College of Public Health, Temple University
Zhe He
University of Macau
deep learning, reinforcement learning, POMDPs
Hui Shao
Hubert Department of Global Health, Emory University
Xiao Hu
Nell Hodgson Woodruff School of Nursing, Emory University
Carl Yang
Waymo LLC, PhD at University of California, Davis
GPU Computing, Parallel Computing, Graph Processing