Human-aligned Deep Learning: Explainability, Causality, and Biological Inspiration

📅 2025-04-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the limited interpretability, robustness, and clinical trustworthiness of deep learning models in medical image classification—particularly their misalignment with human reasoning and clinical decision-making. To bridge this gap, we propose an “interpretability-by-design” paradigm that integrates causal inference with biologically inspired visual mechanisms. Our method introduces the CROCODILE general causal framework, a causal feature co-occurrence module, and the neuroscience-informed CoCoReco network. Technically, it unifies activation maximization, prototype-based part learning, feature disentanglement, domain-specific prior injection, and context-aware attention. Evaluated on breast mass classification, the approach delivers radiologist-aligned, interpretable predictions. The framework demonstrates strong cross-domain generalizability in healthcare, significantly improves out-of-distribution (OOD) robustness, and enhances diagnostic credibility. Notably, it establishes the first systematic integration of eXplainable AI (XAI) and causal learning in medical imaging.

Technology Category

Application Category

📝 Abstract

This work aligns deep learning (DL) with human reasoning capabilities and needs to enable more efficient, interpretable, and robust image classification. We approach this from three perspectives: explainability, causality, and biological vision. Introduction and background open this work before diving into operative chapters. First, we assess neural networks' visualization techniques for medical images and validate an explainable-by-design method for breast mass classification. A comprehensive review at the intersection of XAI and causality follows, where we introduce a general scaffold to organize past and future research, laying the groundwork for our second perspective. In the causality direction, we propose novel modules that exploit feature co-occurrence in medical images, leading to more effective and explainable predictions. We further introduce CROCODILE, a general framework that integrates causal concepts, contrastive learning, feature disentanglement, and prior knowledge to enhance generalization. Lastly, we explore biological vision, examining how humans recognize objects, and propose CoCoReco, a connectivity-inspired network with context-aware attention mechanisms. Overall, our key findings include: (i) simple activation maximization lacks insight for medical imaging DL models; (ii) prototypical-part learning is effective and radiologically aligned; (iii) XAI and causal ML are deeply connected; (iv) weak causal signals can be leveraged without a priori information to improve performance and interpretability; (v) our framework generalizes across medical domains and out-of-distribution data; (vi) incorporating biological circuit motifs improves human-aligned recognition. This work contributes toward human-aligned DL and highlights pathways to bridge the gap between research and clinical adoption, with implications for improved trust, diagnostic accuracy, and safe deployment.

Problem

Research questions and friction points this paper is trying to address.

Enhancing interpretability in medical image classification

Integrating causality for robust deep learning predictions

Incorporating biological vision for human-aligned recognition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Explainable-by-design method for medical images

CROCODILE integrates causality and contrastive learning

CoCoReco uses biological vision-inspired attention

🔎 Similar Papers

No similar papers found.

Authors to Follow