🤖 AI Summary
This study addresses the limited clinical interpretability of Vision Transformers (ViTs) in medical imaging by systematically evaluating whether their attention maps actually localize critical pathological regions. We propose the first evaluation framework for attention-based explanations tailored to medical imaging, introducing two novel quantitative metrics, *anatomical consistency* and *lesion sensitivity*, to assess attention map fidelity. The methodology combines multi-source validation: Grad-CAM and attention rollout visualizations, expert radiologist annotations, and statistical significance testing. Experiments on CheXpert and MIMIC-CXR show that only 38% of attention heatmaps achieve high spatial alignment with expert annotations, revealing a substantial misalignment between current ViT attention explanations and clinically relevant regions. The work thus quantifies the reliability limits of Transformer-based attention explanations and establishes a reproducible, clinically grounded evaluation paradigm to guide the development of trustworthy, deployable explainable AI in radiology.
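
Of the two visualization methods named above, attention rollout (Abnar & Zuidema, 2020) is the one specific to Transformers: it propagates attention through the layers by multiplying per-layer attention matrices, with an identity term added for the residual connections. The sketch below is a minimal NumPy implementation of that standard recipe, not the paper's exact code; the mean head fusion and the CLS-token heatmap extraction are common defaults assumed here.

```python
import numpy as np

def attention_rollout(attentions):
    """Standard attention rollout: multiply per-layer attention matrices,
    adding identity for residual connections and renormalizing rows.

    attentions: one (num_heads, tokens, tokens) array per Transformer layer.
    Returns a (tokens, tokens) matrix of accumulated attention.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for layer_attn in attentions:
        fused = layer_attn.mean(axis=0)                     # fuse heads (mean is a common default)
        fused = fused + np.eye(num_tokens)                  # account for the residual connection
        fused = fused / fused.sum(axis=-1, keepdims=True)   # renormalize rows
        rollout = fused @ rollout                           # propagate through this layer
    return rollout

def cls_heatmap(rollout, grid):
    """Spatial heatmap from the CLS token's accumulated attention to image
    patches (assumes token 0 is CLS and the rest form a grid x grid layout)."""
    return rollout[0, 1:].reshape(grid, grid)
```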
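
The summary names the two metrics but not their formulas. A plausible reading, stated here purely as an assumption, is that *lesion sensitivity* measures how much of an expert-annotated lesion falls inside the most-attended region, and *anatomical consistency* measures overlap (IoU) between that region and the annotated anatomical structure. The sketch below implements that reading; the top-k thresholding and both definitions are hypothetical, not taken from the paper.

```python
import numpy as np

def topk_mask(heatmap, k=0.1):
    """Binarize a heatmap by keeping its top-k fraction of pixels."""
    return heatmap >= np.quantile(heatmap, 1.0 - k)

def lesion_sensitivity(heatmap, lesion_mask, k=0.1):
    """Hypothetical definition: fraction of expert-annotated lesion pixels
    (lesion_mask is boolean) covered by the top-k attention region."""
    attended = topk_mask(heatmap, k)
    return (attended & lesion_mask).sum() / max(lesion_mask.sum(), 1)

def anatomical_consistency(heatmap, region_mask, k=0.1):
    """Hypothetical definition: IoU between the top-k attention region and
    the expert-annotated anatomical region (region_mask is boolean)."""
    attended = topk_mask(heatmap, k)
    intersection = (attended & region_mask).sum()
    union = (attended | region_mask).sum()
    return intersection / max(union, 1)
```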
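
For the statistical significance testing, one standard design, again an assumption rather than the paper's stated protocol, is to compare each heatmap's alignment score against a chance-level score from a spatially permuted copy of the same heatmap, using a paired one-sided Wilcoxon signed-rank test.

```python
import numpy as np
from scipy.stats import wilcoxon

def coverage(heatmap, mask, k=0.1):
    """Fraction of mask pixels (mask is boolean) inside the top-k region."""
    attended = heatmap >= np.quantile(heatmap, 1.0 - k)
    return (attended & mask).sum() / max(mask.sum(), 1)

def alignment_significance(heatmaps, masks, k=0.1, seed=0):
    """Paired one-sided Wilcoxon signed-rank test: are real heatmaps better
    aligned with expert masks than spatially permuted copies of themselves?"""
    rng = np.random.default_rng(seed)
    real, null = [], []
    for h, m in zip(heatmaps, masks):
        real.append(coverage(h, m, k))
        shuffled = rng.permutation(h.ravel()).reshape(h.shape)  # chance baseline
        null.append(coverage(shuffled, m, k))
    return wilcoxon(real, null, alternative="greater")
```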