🤖 AI Summary
Medical deep learning models often lack sufficient interpretability, hindering clinical trust. To address this, we propose a Grad-CAM–based interpretability analysis framework to systematically compare the decision-making mechanisms and diagnostic performance of ResNet50 and DenseNet121 on brain tumor (MRI) and pneumonia (X-ray) classification tasks. Results show that DenseNet121 achieves higher accuracy (94.3% for brain tumors; 89.1% for pneumonia) and generates more precise Grad-CAM heatmaps that consistently focus on pathologically relevant lesion cores, outperforming ResNet50's broader, less specific attention patterns. Radiologist evaluation confirms the clinical plausibility of these findings. This work is the first to empirically demonstrate, within a unified interpretability framework, that architectural choice critically influences the transparency and clinical credibility of medical AI models. It provides evidence-based guidance for model selection and deployment in trustworthy medical AI systems.
📝 Abstract
Deep learning (DL) holds enormous potential for improving medical imaging diagnostics, yet the lack of interpretability in most models hampers clinical trust and adoption. This paper presents an explainable deep learning framework for detecting brain tumors in MRI scans and pneumonia in chest X-ray images using two leading convolutional neural networks, ResNet50 and DenseNet121. These models were trained on publicly available Kaggle datasets comprising 7,023 brain MRI images and 5,863 chest X-ray images, achieving high classification performance. DenseNet121 consistently outperformed ResNet50, with 94.3% vs. 92.5% accuracy for brain tumors and 89.1% vs. 84.4% accuracy for pneumonia. To improve explainability, Gradient-weighted Class Activation Mapping (Grad-CAM) was integrated to create heatmap visualizations superimposed on the test images, highlighting the image regions most influential in the decision-making process. Notably, while both models produced accurate results, Grad-CAM showed that DenseNet121 consistently focused on core pathological regions, whereas ResNet50 sometimes scattered attention to peripheral or non-pathological areas. Combining deep learning and explainable AI offers a promising path toward reliable, interpretable, and clinically useful diagnostic tools.
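The Grad-CAM procedure described above can be sketched in a few lines of PyTorch. The following is a minimal illustration, not the authors' code: it uses a tiny stand-in CNN (hypothetical, so the sketch runs without pretrained ResNet50/DenseNet121 weights) and computes the class activation map as the ReLU of the gradient-weighted sum of the last convolutional feature maps, which would then be upsampled and overlaid on the input image as a heatmap.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Toy stand-in for ResNet50/DenseNet121 (hypothetical, for illustration only)."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        acts = self.features(x)                      # last conv-layer activations
        pooled = F.adaptive_avg_pool2d(acts, 1).flatten(1)
        return self.classifier(pooled), acts

def grad_cam(model, image, target_class):
    """Grad-CAM: channel weights = globally averaged gradients of the
    target-class score w.r.t. the last conv activations."""
    model.eval()
    logits, acts = model(image)
    acts.retain_grad()                               # keep grads on non-leaf tensor
    logits[0, target_class].backward()
    weights = acts.grad.mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * acts).sum(dim=1)).squeeze(0)
    cam -= cam.min()                                 # normalize to [0, 1]
    cam /= cam.max() + 1e-8
    return cam.detach()

x = torch.randn(1, 1, 32, 32)                        # dummy single-channel "scan"
heatmap = grad_cam(TinyCNN(), x, target_class=0)
print(heatmap.shape)                                 # spatial map to upsample & overlay
```

In practice the same hook-free pattern works with a real DenseNet121 by returning the output of its final dense block as `acts`; the resulting map is bilinearly upsampled to the input resolution before being superimposed on the scan.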