🤖 AI Summary
This work addresses the degradation in generalization performance in medical image classification caused by domain shifts arising from multimodality, noise, and label scarcity. The authors propose a domain adaptation method based on shared feature space alignment, integrated within a federated learning framework and systematically evaluated with explainability analysis (GradCAM++) and classifier calibration. Experiments across ten deep architectures (including ResNet34) and four medical imaging datasets demonstrate that the approach improves accuracy by 4.7% in brain tumor classification, enhances robustness to Gaussian noise by roughly 3%, and reduces expected calibration error by about 2% relative to a CNN baseline. The study also reveals both the potential and the limitations of current domain adaptation techniques in multimodal fusion, robustness, and federated settings, evidenced by a marginal 0.3% improvement in skin cancer classification under federated learning.
📝 Abstract
Domain adaptation (DA) is a rapidly expanding area of machine learning concerned with adjusting a model trained on one domain so that it performs well on another. While there has been notable progress, the fundamental idea behind many DA methods has persisted: aligning data from different domains into a shared feature space. In this space, knowledge acquired from labeled source data can improve model training on target data that lacks sufficient labels. In this study, we use 10 deep learning models to simulate common DA techniques and explore their application on four medical image datasets. We consider various scenarios, including multi-modality, noisy data, federated learning (FL), interpretability analysis, and classifier calibration. The experimental results indicate that using DA with ResNet34 on a brain tumor (BT) dataset yields a 4.7% improvement in model performance. Similarly, DA can reduce the impact of Gaussian noise, providing a ∼3% accuracy increase with ResNet34 on the BT dataset. In contrast, simply introducing DA into an FL framework shows limited potential (e.g., a ∼0.3% performance increase) for skin cancer classification. In addition, DA can improve the interpretability of the models, assessed with the GradCAM++ technique, which offers clinical value. Calibration analysis also demonstrates that DA yields an expected calibration error (ECE) roughly 2% lower than a CNN alone on a multi-modality dataset. The code for our experiments is available at https://github.com/AIPMLab/Domain_Adaptation.
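The core idea above is aligning source and target features in a shared space. The abstract does not name a specific alignment loss, so as an illustration only (not the authors' implementation, which is in their repository), here is a minimal NumPy sketch of one widely used objective, the CORAL distance, which penalizes the gap between the second-order statistics (covariances) of source and target features:

```python
import numpy as np

def coral_loss(source, target):
    """CORAL distance between source and target feature matrices.

    source, target: (n_samples, d) feature arrays.
    Returns the squared Frobenius norm of the covariance gap,
    scaled by 1 / (4 d^2) as in the CORAL formulation.
    """
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)  # (d, d) source covariance
    cov_t = np.cov(target, rowvar=False)  # (d, d) target covariance
    return np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d)
```

In a DA training loop this term would be added to the classification loss so the network learns features whose distributions match across domains; note that the loss is invariant to mean shifts, since covariance ignores translation.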
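The calibration claim is stated in terms of expected calibration error (ECE). For readers unfamiliar with the metric, a standard binned formulation (a sketch, independent of the paper's exact implementation) computes, per confidence bin, the gap between average confidence and accuracy, weighted by the bin's share of samples:

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - confidence|."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(labels)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = np.mean(predictions[mask] == labels[mask])
            conf = np.mean(confidences[mask])
            ece += (mask.sum() / n) * abs(acc - conf)
    return ece
```

A lower ECE, as reported for the DA models, means the predicted confidence tracks the empirical accuracy more closely, which matters clinically when probabilities inform decisions.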
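The federated learning experiments combine models trained at separate sites. The abstract does not specify the aggregation rule, but the most common choice, FedAvg, averages client parameters weighted by client dataset size; a minimal sketch under that assumption:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg aggregation (assumed aggregation rule, for illustration).

    client_weights: list of per-client parameter lists, each a list of
                    np.ndarray layers in the same order.
    client_sizes:   number of local samples per client (aggregation weights).
    Returns the size-weighted average of each layer across clients.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(n_layers)
    ]
```

In the paper's setting, DA would be applied to the local models before or during such rounds; the reported ∼0.3% gain for skin cancer suggests that alignment alone interacts only weakly with this aggregation step.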