X-SiT: Inherently Interpretable Surface Vision Transformers for Dementia Diagnosis

📅 2025-06-25
🤖 AI Summary
To address the poor interpretability of 3D voxel-based brain imaging data and the difficulty of using it in clinical practice, this paper introduces the first inherently interpretable cortical surface vision transformer. It proposes a prototypical surface patch decoder, driven by case-based reasoning, that attributes classification decisions to spatially aligned cortical prototypes that can be visualized. The model combines surface patch embedding, geometry-aware positional encoding, and prototype learning to preserve cortical geometry while improving discriminative capacity. Evaluated on multi-center Alzheimer's disease and frontotemporal dementia diagnosis tasks, it achieves state-of-the-art performance. The learned disease-specific cortical prototypes align with known disease patterns, and they also help reveal classification errors. Together, these capabilities give clinicians intuitive, neurobiologically grounded interpretability support for diagnostic decision-making.

📝 Abstract
Interpretable models are crucial for supporting clinical decision-making, driving advances in their development and application for medical images. However, the nature of 3D volumetric data makes it inherently challenging to visualize and interpret intricate and complex structures like the cerebral cortex. Cortical surface renderings, on the other hand, provide a more accessible and understandable 3D representation of brain anatomy, facilitating visualization and interactive exploration. Motivated by this advantage and the widespread use of surface data for studying neurological disorders, we present the eXplainable Surface Vision Transformer (X-SiT). This is the first inherently interpretable neural network that offers human-understandable predictions based on interpretable cortical features. As part of X-SiT, we introduce a prototypical surface patch decoder for classifying surface patch embeddings, incorporating case-based reasoning with spatially corresponding cortical prototypes. The results demonstrate state-of-the-art performance in detecting Alzheimer's disease and frontotemporal dementia while additionally providing informative prototypes that align with known disease patterns and reveal classification errors.
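The idea of classifying surface patch embeddings against spatially corresponding prototypes can be illustrated with a small sketch. This is not the authors' implementation: the function name `prototype_logits`, the use of cosine similarity, the summation of per-patch evidence into class scores, and all shapes and sizes are assumptions chosen for illustration. Each patch embedding is compared only with the prototypes stored for the same patch location, so the per-patch similarities show which cortical regions support the predicted class.

```python
import numpy as np

def prototype_logits(patch_embeddings, prototypes, eps=1e-8):
    """Score classes by comparing each patch with the prototypes of the same patch.

    patch_embeddings: array of shape (P, D), one embedding per surface patch.
    prototypes:       array of shape (P, C, D), one prototype per patch and class.
    Returns (logits of shape (C,), similarities of shape (P, C)).
    Cosine similarity and summation are illustrative assumptions.
    """
    x = patch_embeddings / (np.linalg.norm(patch_embeddings, axis=-1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    # Patch p is matched only against prototypes at the same location p.
    sims = np.einsum("pd,pcd->pc", x, p)
    # Class evidence is accumulated over all patches.
    logits = sims.sum(axis=0)
    return logits, sims

# Synthetic example; sizes are arbitrary and not taken from the paper.
rng = np.random.default_rng(0)
num_patches, num_classes, dim = 12, 3, 16
prototypes = rng.normal(size=(num_patches, num_classes, dim))

# A synthetic "subject" whose patches resemble the class-1 prototypes plus noise.
embeddings = prototypes[:, 1, :] + 0.1 * rng.normal(size=(num_patches, dim))

logits, sims = prototype_logits(embeddings, prototypes)
predicted = int(np.argmax(logits))

# Patches ranked by how strongly they support the predicted class.
top_patches = np.argsort(sims[:, predicted])[::-1][:5]
print("predicted class:", predicted)
print("most supportive patches:", top_patches)
```

In a sketch like this, the `sims` matrix is what makes the prediction inspectable: a reader can check which patches contributed most to the winning class, and patches with high similarity to a competing class hint at where a misclassification might originate.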
Problem

Research questions and friction points this paper is trying to address.

Develop an interpretable model for dementia diagnosis
Improve visualization of 3D brain structures
Classify surface patches for disease detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inherently interpretable Surface Vision Transformer
Prototypical surface patch decoder
Human-understandable cortical feature predictions