🤖 AI Summary
Late-life depression (LLD) assessment faces challenges in leveraging multi-center structural MRI data due to severe inter-site heterogeneity, limited sample sizes per site, and poor cross-site generalizability. Method: We propose a collaborative domain adaptation framework featuring a dual-branch ViT-CNN architecture—where the ViT branch captures global anatomical patterns and the CNN branch encodes local morphological features—integrated with self-supervised feature alignment, collaborative pseudo-labeling, and consistency regularization under strong-weak data augmentation. Contribution/Results: Evaluated on multi-site T1-weighted MRI datasets, our method significantly outperforms existing unsupervised domain adaptation approaches, achieving substantial gains in classification accuracy. It simultaneously enhances model robustness and clinical deployability across diverse imaging protocols and scanner platforms, demonstrating improved cross-domain representation alignment and discriminative boundary optimization.
📝 Abstract
Accurate identification of late-life depression (LLD) from structural brain MRI is essential for monitoring disease progression and enabling timely intervention. However, existing learning-based approaches to LLD detection are often constrained by limited sample sizes (e.g., tens of subjects), which poses significant challenges for reliable model training and generalization. Although incorporating auxiliary datasets can expand the training set, substantial domain heterogeneity, such as differences in imaging protocols, scanner hardware, and population demographics, often undermines cross-domain transferability. To address this issue, we propose a Collaborative Domain Adaptation (CDA) framework for LLD detection using T1-weighted MRIs. The CDA pairs a Vision Transformer (ViT), which captures global anatomical context, with a Convolutional Neural Network (CNN), which extracts local structural features; each branch comprises an encoder and a classifier. The framework proceeds in three stages: (a) supervised training on labeled source data, (b) self-supervised target feature adaptation, and (c) collaborative training on unlabeled target data. We first train the ViT and CNN branches on source data, then adapt target features in a self-supervised manner by minimizing the discrepancy between the two classifiers' outputs, which sharpens the category decision boundary. The collaborative training stage employs pseudo-labeled and augmented target-domain MRIs, enforcing prediction consistency between strongly and weakly augmented views to enhance domain robustness and generalization. Extensive experiments on multi-site T1-weighted MRI data demonstrate that CDA consistently outperforms state-of-the-art unsupervised domain adaptation methods.
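The two adaptation losses described in the abstract can be sketched in plain Python. This is a minimal, framework-free illustration, not the authors' implementation: the L1 form of the inter-classifier discrepancy, the FixMatch-style confidence threshold of 0.95, and the function names are all assumptions for illustration.

```python
import math

def softmax(logits):
    """Convert raw classifier logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def discrepancy_loss(logits_vit, logits_cnn):
    """Self-supervised adaptation signal: mean absolute difference
    between the ViT and CNN branches' class probabilities on the same
    target image (L1 discrepancy is one common choice, assumed here)."""
    p, q = softmax(logits_vit), softmax(logits_cnn)
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)

def consistency_loss(weak_logits, strong_logits, threshold=0.95):
    """Consistency regularization: the prediction on a weakly augmented
    view serves as a pseudo-label supervising the strongly augmented
    view, but only when that prediction is confident enough
    (threshold value is an assumption)."""
    p_weak = softmax(weak_logits)
    confidence = max(p_weak)
    if confidence < threshold:
        return 0.0  # low-confidence target samples contribute no loss
    pseudo_label = p_weak.index(confidence)
    p_strong = softmax(strong_logits)
    # cross-entropy against the hard pseudo-label
    return -math.log(p_strong[pseudo_label] + 1e-12)
```

When the two branches agree (or the strong view matches the confident weak-view pseudo-label), both losses approach zero; disagreement drives the gradients that align target features and tighten the decision boundary.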