X-SiT: Inherently Interpretable Surface Vision Transformers for Dementia Diagnosis

📅 2025-06-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the difficulty of visualizing and interpreting 3D volumetric brain imaging data, this paper introduces X-SiT, presented as the first inherently interpretable vision transformer that operates on cortical surfaces. Its core component is a prototypical surface patch decoder that classifies surface patch embeddings through case-based reasoning, attributing each decision to spatially corresponding cortical prototypes. The model combines surface patch embedding, geometry-aware positional encoding, and prototype learning to preserve cortical geometry while remaining discriminative. On multi-center Alzheimer's disease and frontotemporal dementia diagnosis tasks, it reports state-of-the-art performance. The learned prototypes align with known disease patterns and help reveal classification errors, giving clinicians human-understandable, anatomically grounded support for diagnostic decisions.

📝 Abstract

Interpretable models are crucial for supporting clinical decision-making, driving advances in their development and application for medical images. However, the nature of 3D volumetric data makes it inherently challenging to visualize and interpret intricate and complex structures like the cerebral cortex. Cortical surface renderings, on the other hand, provide a more accessible and understandable 3D representation of brain anatomy, facilitating visualization and interactive exploration. Motivated by this advantage and the widespread use of surface data for studying neurological disorders, we present the eXplainable Surface Vision Transformer (X-SiT). This is the first inherently interpretable neural network that offers human-understandable predictions based on interpretable cortical features. As part of X-SiT, we introduce a prototypical surface patch decoder for classifying surface patch embeddings, incorporating case-based reasoning with spatially corresponding cortical prototypes. The results demonstrate state-of-the-art performance in detecting Alzheimer's disease and frontotemporal dementia while additionally providing informative prototypes that align with known disease patterns and reveal classification errors.

Problem

Research questions and friction points this paper is trying to address.

Develop interpretable model for dementia diagnosis

Improve visualization of 3D brain structures

Classify surface patches for disease detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Inherently interpretable Surface Vision Transformer

Prototypical surface patch decoder

Human-understandable cortical feature predictions
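
To make the idea of a prototypical surface patch decoder more concrete, the following is a minimal sketch of prototype-based classification over patch embeddings. It is not the authors' implementation: the function name `prototype_logits`, the use of one prototype per class and patch location, cosine similarity as the matching score, mean pooling over patches, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def prototype_logits(patch_emb, prototypes, eps=1e-8):
    """Score one subject against spatially corresponding prototypes.

    patch_emb:  (P, D) array, one embedding per surface patch (hypothetical input).
    prototypes: (C, P, D) array, one prototype per class and patch location
                (an assumed layout, not taken from the paper).
    Returns class logits of shape (C,) and per-patch evidence of shape (C, P).
    """
    # Normalize so that the dot product below is a cosine similarity.
    emb = patch_emb / (np.linalg.norm(patch_emb, axis=-1, keepdims=True) + eps)
    proto = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    # Compare each patch only with the prototype at the same cortical location.
    evidence = np.einsum("pd,cpd->cp", emb, proto)
    # Assumed pooling: average the per-patch evidence into one logit per class.
    logits = evidence.mean(axis=1)
    return logits, evidence


def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()


# Synthetic example: 3 classes, 8 patches, 16-dimensional embeddings.
rng = np.random.default_rng(0)
num_classes, num_patches, dim = 3, 8, 16
prototypes = rng.normal(size=(num_classes, num_patches, dim))
# A fake subject whose patches lie close to the class-1 prototypes.
subject = prototypes[1] + 0.1 * rng.normal(size=(num_patches, dim))

logits, evidence = prototype_logits(subject, prototypes)
probs = softmax(logits)
predicted = int(np.argmax(logits))
# The patch contributing the most evidence for the predicted class.
top_patch = int(np.argmax(evidence[predicted]))
print(predicted, top_patch, probs.round(3))
```

In a sketch like this, the `evidence` matrix is what would support interpretation: each entry ties a class score to a specific patch location, so the most influential cortical regions for a prediction can be rendered on the surface.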
