🤖 AI Summary
This study addresses the limited interpretability, robustness, and clinical trustworthiness of deep learning models in medical image classification—particularly their misalignment with human reasoning and clinical decision-making. To bridge this gap, we propose an “interpretability-by-design” paradigm that integrates causal inference with biologically inspired visual mechanisms. Our method introduces the CROCODILE general causal framework, a causal feature co-occurrence module, and the neuroscience-informed CoCoReco network. Technically, it unifies activation maximization, prototype-based part learning, feature disentanglement, domain-specific prior injection, and context-aware attention. Evaluated on breast mass classification, the approach delivers radiologist-aligned, interpretable predictions. The framework demonstrates strong cross-domain generalizability in healthcare, significantly improves out-of-distribution (OOD) robustness, and enhances diagnostic credibility. Notably, it establishes the first systematic integration of eXplainable AI (XAI) and causal learning in medical imaging.
📝 Abstract
This work aligns deep learning (DL) with human reasoning capabilities and needs to enable more efficient, interpretable, and robust image classification. We approach this from three perspectives: explainability, causality, and biological vision. Introduction and background open this work before diving into operative chapters. First, we assess neural networks' visualization techniques for medical images and validate an explainable-by-design method for breast mass classification. A comprehensive review at the intersection of XAI and causality follows, where we introduce a general scaffold to organize past and future research, laying the groundwork for our second perspective. In the causality direction, we propose novel modules that exploit feature co-occurrence in medical images, leading to more effective and explainable predictions. We further introduce CROCODILE, a general framework that integrates causal concepts, contrastive learning, feature disentanglement, and prior knowledge to enhance generalization. Lastly, we explore biological vision, examining how humans recognize objects, and propose CoCoReco, a connectivity-inspired network with context-aware attention mechanisms. Overall, our key findings include: (i) simple activation maximization lacks insight for medical imaging DL models; (ii) prototypical-part learning is effective and radiologically aligned; (iii) XAI and causal ML are deeply connected; (iv) weak causal signals can be leveraged without a priori information to improve performance and interpretability; (v) our framework generalizes across medical domains and out-of-distribution data; (vi) incorporating biological circuit motifs improves human-aligned recognition. This work contributes toward human-aligned DL and highlights pathways to bridge the gap between research and clinical adoption, with implications for improved trust, diagnostic accuracy, and safe deployment.