Knowledge Graph Sparsification for GNN-based Rare Disease Diagnosis

📅 2025-10-09
🤖 AI Summary
Diagnosing rare genetic disorders faces three core challenges: scarce patient data, limited access to whole-genome sequencing, and a very large space of candidate pathogenic genes. These problems are particularly acute in resource-constrained settings. To address them, the authors propose RareNet, a phenotype-driven method that combines knowledge graph subgraph sparsification with graph neural networks (GNNs). RareNet constructs patient-specific subgraphs solely from standardized clinical phenotypes and uses them to prioritize likely causal genes. Its key contributions are: (1) structure-aware sparsification that retains biomedically salient relationships; (2) plug-and-play use, either as a standalone model or as a pre-processing or post-processing filter for other candidate gene prioritization methods, with potential for explainable insights; and (3) competitive and robust causal gene prediction on two biomedical datasets, with significant performance gains when integrated with other frameworks.
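The summary does not say how the sparsification is performed. As a rough illustration of what edge sparsification on a weighted knowledge graph can look like, the sketch below keeps an edge if it is among the k heaviest edges of at least one of its endpoints. The rule, the function name `sparsify_top_k`, the edge weights, and all node names are assumptions made for this example; they are not RareNet's actual procedure or data.

```python
def sparsify_top_k(weighted_edges, k):
    """Keep an edge if it ranks in the top k (by weight) for either endpoint.

    Hypothetical rule for illustration only; not the paper's method.
    """
    by_node = {}
    for u, v, w in weighted_edges:
        by_node.setdefault(u, []).append((w, u, v))
        by_node.setdefault(v, []).append((w, u, v))
    kept = set()
    for incident in by_node.values():
        # Heaviest edges first; ties are broken by node name.
        for w, u, v in sorted(incident, reverse=True)[:k]:
            kept.add((u, v, w))
    return kept


# Toy edges with made-up relevance weights (placeholders, not real data).
toy_edges = [
    ("phen:P1", "gene:G1", 0.9),
    ("phen:P1", "gene:G2", 0.2),
    ("phen:P1", "gene:G3", 0.1),
    ("gene:G2", "disease:D1", 0.8),
]
sparse_edges = sparsify_top_k(toy_edges, k=1)
print(sorted(sparse_edges))
```

With `k=1`, the weak `phen:P1` to `gene:G2` edge is dropped because neither endpoint ranks it first, while `gene:G3` keeps its only edge. Taking the union over endpoints avoids isolating low-degree nodes, which is one common reason to prefer it over a single global weight threshold.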

📝 Abstract
Rare genetic disease diagnosis faces critical challenges: insufficient patient data, inaccessible full genome sequencing, and the immense number of possible causative genes. These limitations cause prolonged diagnostic journeys, inappropriate treatments, and critical delays, disproportionately affecting patients in resource-limited settings where diagnostic tools are scarce. We propose RareNet, a subgraph-based Graph Neural Network that requires only patient phenotypes to identify the most likely causal gene and retrieve focused patient subgraphs for targeted clinical investigation. RareNet can function as a standalone method or serve as a pre-processing or post-processing filter for other candidate gene prioritization methods, consistently enhancing their performance while potentially enabling explainable insights. Through comprehensive evaluation on two biomedical datasets, we demonstrate competitive and robust causal gene prediction and significant performance gains when integrated with other frameworks. By requiring only phenotypic data, which is readily available in any clinical setting, RareNet democratizes access to sophisticated genetic analysis, offering particular value for underserved populations lacking advanced genomic infrastructure.
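The abstract states that RareNet needs only patient phenotypes and retrieves a focused patient subgraph, but it does not describe the retrieval step. One simple way to picture it, offered here as a sketch under assumptions rather than the authors' algorithm, is a breadth-first expansion from the patient's phenotype nodes in a knowledge graph. The graph `TOY_KG`, the `phen:`/`gene:`/`disease:` naming scheme, and the `max_hops` parameter are all hypothetical.

```python
from collections import deque

# Toy undirected knowledge graph as an adjacency map.
# Every node name here is a hypothetical placeholder.
TOY_KG = {
    "phen:seizures": {"gene:G1", "gene:G2"},
    "phen:ataxia": {"gene:G2", "gene:G3"},
    "gene:G1": {"phen:seizures", "disease:D1"},
    "gene:G2": {"phen:seizures", "phen:ataxia", "disease:D2"},
    "gene:G3": {"phen:ataxia"},
    "disease:D1": {"gene:G1"},
    "disease:D2": {"gene:G2"},
    "gene:G4": {"disease:D3"},
    "disease:D3": {"gene:G4"},
}


def patient_subgraph(kg, phenotypes, max_hops=1):
    """Return (nodes, edges) within max_hops of any patient phenotype."""
    # Multi-source BFS: every known phenotype starts at distance 0.
    dist = {p: 0 for p in phenotypes if p in kg}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in kg[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    nodes = set(dist)
    # Induced edges: both endpoints must lie inside the subgraph.
    edges = {frozenset((u, v)) for u in nodes for v in kg[u] if v in nodes}
    return nodes, edges


nodes, edges = patient_subgraph(TOY_KG, ["phen:seizures", "phen:ataxia"])
candidate_genes = sorted(n for n in nodes if n.startswith("gene:"))
print(candidate_genes)
```

In this toy case, `gene:G4` is never reached from the patient's phenotypes, so it is excluded from the candidate set. A GNN would then score the remaining gene nodes; that scoring step is omitted here because the abstract gives no architectural details.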
Problem

Research questions and friction points this paper is trying to address.

Diagnosing rare genetic diseases with limited patient data
Overcoming inaccessible genome sequencing in resource-limited settings
Identifying causal genes using only phenotypic information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses phenotype-only input for gene prioritization
Leverages subgraph-based GNN for rare disease diagnosis
Functions as standalone or integrated filter method
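The "integrated filter" role can be pictured with a small sketch. Assuming an upstream prioritization tool returns a ranked gene list and a patient subgraph supplies a set of genes, a post-processing filter might promote the genes that appear in the subgraph while preserving the upstream order within each group. The function `rerank_with_subgraph` and the gene identifiers are hypothetical; the source does not specify how RareNet combines its output with other methods.

```python
def rerank_with_subgraph(ranked_genes, subgraph_genes):
    """Stable re-rank: genes in the patient subgraph first, others after.

    Illustrative post-processing rule only; not taken from the paper.
    """
    inside = [g for g in ranked_genes if g in subgraph_genes]
    outside = [g for g in ranked_genes if g not in subgraph_genes]
    return inside + outside


# Hypothetical ranking from some other prioritization method.
upstream_ranking = ["gene:G4", "gene:G2", "gene:G5", "gene:G1"]
# Hypothetical genes found in the patient-specific subgraph.
subgraph_genes = {"gene:G1", "gene:G2", "gene:G3"}

reranked = rerank_with_subgraph(upstream_ranking, subgraph_genes)
print(reranked)
```

Used as a pre-processing filter instead, the same gene set would restrict the candidates handed to the upstream method (for example, passing only `inside`), trading recall for a smaller search space.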