A Review of Ontology-Driven Big Data Analytics in Healthcare: Challenges, Tools, and Applications

📅 2025-10-07
🤖 AI Summary
Problem: Heterogeneous healthcare big data, including electronic health records (EHRs), medical imaging, wearable sensor streams, and biomedical omics data, frequently devolves into “data swamps” because of weak semantic interoperability, poor discoverability, and insufficient domain-aware access mechanisms. Method: This paper conducts a systematic literature review and proposes a six-category taxonomy of ontology-driven healthcare analytics. It examines how ontology modeling, knowledge graph construction, semantic reasoning, and Ontology-Based Data Access (OBDA) integrate with scalable big data infrastructures such as Hadoop, Spark, and Kafka. Contribution/Results: It systematically identifies emerging trends in semantic interoperability enhancement and in the combined use of knowledge graphs and AI analytics. It synthesizes key technical challenges and pragmatic implementation pathways driven by IoT integration and real-time analytics. It also offers a conceptual framework and an integrated architectural view to guide the building of sustainable, interpretable, and scalable healthcare data ecosystems.

📝 Abstract
Exponential growth in heterogeneous healthcare data arising from electronic health records (EHRs), medical imaging, wearable sensors, and biomedical research has accelerated the adoption of data lakes and centralized architectures capable of handling the Volume, Variety, and Velocity of Big Data for advanced analytics. However, without effective governance, these repositories risk devolving into disorganized data swamps. Ontology-driven semantic data management offers a robust solution by linking metadata to healthcare knowledge graphs, thereby enhancing semantic interoperability, improving data discoverability, and enabling expressive, domain-aware access. This review adopts a systematic research strategy: it formulates key research questions and conducts a structured literature search across major academic databases. The selected studies are analyzed and classified into six categories of ontology-driven healthcare analytics: (i) ontology-driven integration frameworks, (ii) semantic modeling for metadata enrichment, (iii) ontology-based data access (OBDA), (iv) basic semantic data management, (v) ontology-based reasoning for decision support, and (vi) semantic annotation for unstructured data. We further examine the integration of ontology technologies with Big Data frameworks such as Hadoop, Spark, and Kafka, highlighting their combined potential to deliver scalable and intelligent healthcare analytics. For each category, we review recent techniques, representative case studies, technical and organizational challenges, and emerging trends such as artificial intelligence, machine learning, the Internet of Things (IoT), and real-time analytics, with the aim of guiding the development of sustainable, interoperable, and high-performance healthcare data ecosystems.
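To make the ontology-based data access (OBDA) category more concrete, the following is a minimal sketch of the general idea, not the method of any system covered by the review. A query phrased against an ontology class is expanded to its subclasses and rewritten into SQL over the underlying sources. The toy class hierarchy, the table and column names, and the helpers `subclasses`, `rewrite`, `patients_with`, and `demo_connection` are all hypothetical; only the standard-library `sqlite3` module is used.

```python
import sqlite3

# Hypothetical toy ontology: child class -> parent class.
SUBCLASS_OF = {
    "Type1Diabetes": "Diabetes",
    "Type2Diabetes": "Diabetes",
    "Diabetes": "MetabolicDisorder",
}

# Hypothetical mappings from ontology classes to source-specific SQL.
# Each source keeps its own schema; the ontology is the shared vocabulary.
MAPPINGS = {
    "Type1Diabetes": [
        "SELECT patient_id FROM ehr_diagnoses WHERE icd10 LIKE 'E10%'",
        "SELECT subject AS patient_id FROM registry WHERE label = 'T1D'",
    ],
    "Type2Diabetes": [
        "SELECT patient_id FROM ehr_diagnoses WHERE icd10 LIKE 'E11%'",
    ],
}


def subclasses(cls):
    """Return cls plus every class that is transitively a subclass of it."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def rewrite(cls):
    """Rewrite a class-level query into a UNION of the mapped source queries."""
    parts = [
        sql
        for name in sorted(subclasses(cls))
        for sql in MAPPINGS.get(name, [])
    ]
    if not parts:
        raise ValueError(f"no source mapping reachable from class {cls!r}")
    return " UNION ".join(parts)


def patients_with(conn, cls):
    """Run the rewritten query and return the matching patient ids, sorted."""
    return sorted(row[0] for row in conn.execute(rewrite(cls)))


def demo_connection():
    """Build an in-memory database with two made-up heterogeneous sources."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ehr_diagnoses (patient_id TEXT, icd10 TEXT);
        CREATE TABLE registry (subject TEXT, label TEXT);
        INSERT INTO ehr_diagnoses VALUES
            ('p1', 'E11.9'), ('p2', 'E10.1'), ('p3', 'I10');
        INSERT INTO registry VALUES ('p4', 'T1D'), ('p5', 'asthma');
        """
    )
    return conn
```

Calling `patients_with(demo_connection(), "Diabetes")` queries both sources through the single class name `Diabetes`, even though neither table mentions that term. Production OBDA systems typically express the ontology in OWL and the mappings in a dedicated mapping language rather than in Python dictionaries; the dictionaries here only stand in for those artifacts.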
Problem

Research questions and friction points this paper is trying to address.

Managing exponential growth of heterogeneous healthcare data
Addressing semantic interoperability challenges in healthcare analytics
Integrating ontology technologies with Big Data frameworks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ontology-driven semantic data management for healthcare
Integration of ontologies with Big Data frameworks
Semantic modeling and reasoning for decision support