Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

📅 2025-05-26

📈 Citations: 0

✨ Influential: 0

career value

189K/year

🤖 AI Summary

In medical federated learning (FL), aligning semantic meanings across heterogeneous, multi-source electronic health records (EHRs) remains challenging under strict privacy regulations (e.g., GDPR, HIPAA). Method: This paper proposes a two-stage federated data alignment framework integrating biomedical ontologies (UMLS/SNOMED CT) with fine-tuned clinical large language models (LLMs). Structured ontology knowledge and LLM embeddings are jointly incorporated into the federated data layer to enable dynamic, interpretable concept mapping—eliminating reliance on data homogeneity typical in conventional FL. A privacy-preserving federated mapping protocol ensures regulatory compliance. Results: Evaluated on real-world multi-center EHR data, the method achieves 92.3% cross-institutional concept mapping accuracy and improves downstream federated model AUC by 11.7%, significantly enhancing collaborative modeling over heterogeneous medical data.

Technology Category

Application Category

📝 Abstract

The rise of electronic health records (EHRs) has unlocked new opportunities for medical research, but privacy regulations and data heterogeneity remain key barriers to large-scale machine learning. Federated learning (FL) enables collaborative modeling without sharing raw data, yet faces challenges in harmonizing diverse clinical datasets. This paper presents a two-step data alignment strategy integrating ontologies and large language models (LLMs) to support secure, privacy-preserving FL in healthcare, demonstrating its effectiveness in a real-world project involving semantic mapping of EHR data.

Problem

Research questions and friction points this paper is trying to address.

Overcoming privacy and data heterogeneity in healthcare machine learning

Harmonizing diverse clinical datasets for federated learning

Integrating ontologies and LLMs for secure EHR data alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ontology-based data alignment for federated learning

LLM-enhanced semantic mapping of clinical datasets

Two-step strategy for privacy-preserving healthcare FL

🔎 Similar Papers

No similar papers found.