🤖 AI Summary
Early detection of chronic kidney disease (CKD) in outpatient settings is hindered by limited access to kidney-specific biomarkers. To address this, we propose Nephrology-Oriented Representation leArning (NORA), a supervised contrastive learning framework that builds disease-informed patient representations solely from routinely available, non-renal variables (demographics, comorbidities, and urinalysis metrics) extracted from electronic health records. These representations are fed into a random forest classifier for CKD classification. Evaluated on the Riverside clinical dataset, NORA achieves a 12.3% improvement in F1-score for early CKD identification and demonstrates strong cross-institutional generalizability on the UCI CKD dataset. Crucially, NORA is the first method to explicitly embed domain knowledge into the contrastive learning objective. This enables high-accuracy, interpretable, and biomarker-free early CKD risk stratification, establishing a novel screening paradigm for resource-constrained healthcare settings.
📝 Abstract
Chronic Kidney Disease (CKD) affects millions of people worldwide, yet its early detection remains challenging, especially in outpatient settings where laboratory-based renal biomarkers are often unavailable. In this work, we investigate the predictive potential of routinely collected non-renal clinical variables for CKD classification, including sociodemographic factors, comorbid conditions, and urinalysis findings. We introduce the Nephrology-Oriented Representation leArning (NORA) approach, which combines supervised contrastive learning with a nonlinear Random Forest classifier. NORA first derives discriminative patient representations from tabular EHR data, which are then used for downstream CKD classification. We evaluated NORA on a clinic-based EHR dataset from Riverside Nephrology Physicians. Our results demonstrated that NORA improves class separability and overall classification performance, particularly enhancing the F1-score for early-stage CKD. Additionally, we assessed the generalizability of NORA on the UCI CKD dataset, demonstrating its effectiveness for CKD risk stratification across distinct patient cohorts.
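To make the representation-learning step concrete, here is a minimal sketch of a supervised contrastive objective over patient embeddings. The abstract does not give the exact loss, so this assumes the commonly used supervised contrastive (SupCon) formulation. The function names, the temperature value, and the toy vectors are illustrative assumptions, not details from the paper. In a full pipeline, an encoder would be trained to minimize this loss, and its embeddings would then be passed to a downstream classifier such as a Random Forest.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with >= 1 positive.

    Assumed formulation (not confirmed by the abstract): for each anchor i,
    every other sample with the same label is a positive, and all other
    samples form the softmax denominator.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp for the denominator.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D "patient embeddings": two tight clusters.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_labels = [0, 0, 1, 1]     # labels agree with the clusters
misaligned_labels = [0, 1, 0, 1]  # labels cut across the clusters

print("aligned loss:   ", supcon_loss(toy_embeddings, aligned_labels))
print("misaligned loss:", supcon_loss(toy_embeddings, misaligned_labels))
```

The loss is lower when samples sharing a label sit close together in embedding space, which is the class-separability property the abstract attributes to NORA's representations.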