Towards Transparent and Accurate Diabetes Prediction Using Machine Learning and Explainable Artificial Intelligence

📅 2025-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study targets a common tension in early diabetes screening: achieving high predictive accuracy while keeping predictions clinically interpretable. It proposes a predictive framework that combines imbalance learning, ensemble modeling, and eXplainable Artificial Intelligence (XAI). The pipeline applies SMOTE-based oversampling and feature standardization, trains XGBoost and Random Forest ensemble models, and explains their predictions with both SHAP and LIME attribution analysis. Evaluated on a public dataset, the framework achieves 92.50% accuracy and a ROC-AUC of 0.975, outperforming baseline models. It also identifies core risk factors, including BMI, age, and self-rated health, and produces a ranked list of risk drivers that clinicians can read directly. By pairing predictive performance with interpretability, the framework is intended to support deployable decision-support tools for clinical practice.
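For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and ROC-AUC are computed. The labels and scores are made-up toy values, not data from the paper, and the function names are illustrative. ROC-AUC is computed here through its pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = diabetic, 0 = non-diabetic.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]          # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # threshold at 0.5

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"ROC-AUC  = {roc_auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen decision threshold, while ROC-AUC depends only on how the scores rank the cases, which is why the two are usually reported together.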

📝 Abstract
Diabetes mellitus (DM) is a significant global health issue that requires early diagnosis and effective management. This study presents a framework for diabetes prediction using Machine Learning (ML) models, complemented by eXplainable Artificial Intelligence (XAI) tools, to examine both the predictive accuracy and the interpretability of the models' predictions. Data preprocessing applies the Synthetic Minority Oversampling Technique (SMOTE) and feature scaling to the Diabetes Binary Health Indicators dataset to address class imbalance and variability in clinical features. The ensemble model achieved a test accuracy of 92.50% and an ROC-AUC of 0.975. According to the model explanations, BMI, Age, General Health, Income, and Physical Activity were the most influential predictors. These results suggest that combining ML with XAI is a promising way to develop accurate and transparent tools for use in healthcare systems.
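The preprocessing described above can be illustrated with a small, standard-library-only sketch. This is not the authors' code: the function names, the two toy feature columns, and the choice of k are assumptions for illustration, and a real pipeline would more likely use an established library implementation. The sketch shows z-score feature scaling followed by the core idea of SMOTE, which creates each synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-score)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating toward minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding itself).
        others = [r for r in minority if r is not base]
        neighbours = sorted(others, key=lambda r: math.dist(base, r))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


# Toy data (hypothetical): columns stand in for two features such as BMI and age.
majority = [[25.0, 40.0], [27.0, 35.0], [24.0, 50.0],
            [26.0, 45.0], [23.0, 38.0], [28.0, 42.0]]
minority = [[34.0, 60.0], [36.0, 65.0], [33.0, 58.0]]

scaled = standardize(majority + minority)
scaled_minority = scaled[len(majority):]

# Oversample the minority class until both classes have the same size.
new_rows = smote(scaled_minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = scaled_minority + new_rows
print(len(majority), len(balanced_minority))
```

In practice, oversampling is normally applied to the training split only, and the scaling statistics are fitted on that split, so that synthetic samples and test-set information do not leak into the evaluation. The abstract does not state how the study ordered these steps.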
Problem

Research questions and friction points this paper is trying to address.

Diabetes Prediction
Early Detection
Disease Management

Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine Learning
Explainable AI (XAI)
Diabetes Risk Prediction