🤖 AI Summary
This study addresses the need for efficient, scalable tools to automate the clinical staging of Alzheimer's disease (AD), a process that is currently time-consuming and prone to variability. The authors propose an attention-augmented multimodal ordinal regression framework that integrates T1-weighted MRI with demographic and genetic data for automated AD severity staging. The approach combines a multimodal attention mechanism with an ordinal prediction head and uses Grad-CAM++ and SHAP for interpretability. In experiments on the ADNI, AIBL, and NIFD datasets, with subject-level splitting and a strictly held-out test set to prevent data leakage, the ordinal multimodal model achieves the highest adjacent-stage accuracy (0.970) and the strongest agreement with clinical staging (QWK = 0.549) among the unimodal and non-ordinal models compared.
📝 Abstract
Neurodegenerative diseases such as Alzheimer's disease (AD) require accurate and scalable tools for assessing disease severity, yet current clinical staging remains time-intensive and prone to variability. We propose an attention-enhanced multimodal machine learning framework with ordinal regression for automated and interpretable AD severity staging. The framework integrates T1-weighted MRI with demographic and genetic variables and compares unimodal and multimodal architectures using ordinal and non-ordinal prediction heads. Models were trained and validated using cohort-stratified splits derived from the ADNI, AIBL, and NIFD datasets. A strictly held-out test set was constructed from subjects excluded from all training, validation, preprocessing, and hyperparameter tuning procedures, with subject-level splitting employed throughout to prevent data leakage. Among unimodal approaches, the T1-weighted MRI model achieved an adjacent-stage accuracy of 0.963 and slightly stronger agreement with clinical staging (quadratic weighted kappa, QWK, 0.444) than the tabular model (QWK 0.433). Integrating imaging, demographic, and genetic information improved overall performance. The multimodal non-ordinal baseline achieved the lowest prediction error (mean absolute error, MAE, 0.340), whereas the ordinal multimodal model achieved the highest adjacent-stage accuracy (0.970) and strongest agreement with clinical staging (QWK 0.549). These findings indicate that ordinal formulations better capture the ordered structure of the Clinical Dementia Rating (CDR) scale and yield predictions more consistent with clinical staging. Explainability analyses using Grad-CAM++ and SHAP showed anatomically and clinically plausible model behavior, supporting transparent decision-making. Overall, attention-based multimodal learning with ordinal regression represents a robust, interpretable, and scalable approach for automated AD severity staging and AI-assisted clinical decision support.
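The three metrics reported above (adjacent-stage accuracy, MAE, and QWK) can be illustrated with a minimal sketch. This is not the authors' evaluation code: it assumes that severity stages are encoded as consecutive integers 0..K-1, which the abstract does not specify, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def adjacent_accuracy(y_true, y_pred):
    """Fraction of predictions that fall within one stage of the reference."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute distance, in stages, between prediction and reference."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic disagreement weights (i - j)^2.

    Assumes labels are integers in 0..num_classes-1 (an assumption of this
    sketch). The usual 1 / (K - 1)^2 normalisation of the weights cancels
    between numerator and denominator, so it is omitted.
    """
    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # observed confusion counts
    true_hist = Counter(y_true)              # marginal of reference labels
    pred_hist = Counter(y_pred)              # marginal of predicted labels
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count under independence of the two marginals.
            weighted_expected += weight * true_hist[i] * pred_hist[j] / n
    if weighted_expected == 0:
        raise ValueError("QWK is undefined when both raters use a single class")
    return 1.0 - weighted_observed / weighted_expected


# Toy labels for four ordered stages (made up for illustration only).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 3, 3, 1]

acc = adjacent_accuracy(y_true, y_pred)            # 7 of 8 within one stage
mae = mean_absolute_error(y_true, y_pred)          # 4 stage-steps over 8 cases
qwk = quadratic_weighted_kappa(y_true, y_pred, 4)
```

On the toy labels, a single prediction that is two stages off leaves MAE moderate but is penalised four times as heavily as a one-stage error in QWK. This is one reason a model can have the lowest MAE without having the highest QWK, as reported for the multimodal non-ordinal baseline and the ordinal multimodal model.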