🤖 AI Summary
Models deployed across medical institutions often lose accuracy because of distribution shift and a mismatch in class proportions between the source and target domains; existing domain adaptation methods struggle when the target-domain class proportions differ significantly from those of the source. This paper proposes a weakly supervised domain adaptation framework that uses only coarse-grained class-proportion information about the target domain, obtainable from prior knowledge or statistical reports, and requires no additional annotations. It introduces a proportion-constrained pseudo-labeling mechanism that explicitly models and corrects class-distribution bias during training. Specifically, pseudo-labels are generated under proportion guidance and reinforced via consistency regularization to enhance cross-domain generalization. Evaluated on two endoscopic datasets, the method outperforms semi-supervised domain adaptation approaches, even when 5% of the target-domain data is labeled. It also remains robust when the class-proportion information is noisy.
📝 Abstract
Domain shift is a significant challenge in machine learning, particularly in medical applications where data distributions differ across institutions due to variations in data collection practices, equipment, and procedures. This can degrade performance when models trained on source domain data are applied to the target domain. Domain adaptation methods have been widely studied to address this issue, but most struggle when class proportions between the source and target domains differ. In this paper, we propose a weakly-supervised domain adaptation method that leverages class proportion information from the target domain, which is often accessible in medical datasets through prior knowledge or statistical reports. Our method assigns pseudo-labels to the unlabeled target data based on class proportion (called proportion-constrained pseudo-labeling), improving performance without the need for additional annotations. Experiments on two endoscopic datasets demonstrate that our method outperforms semi-supervised domain adaptation techniques, even when 5% of the target domain is labeled. Additionally, the experimental results with noisy proportion labels highlight the robustness of our method, further demonstrating its effectiveness in real-world application scenarios.
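To make the idea of proportion-constrained pseudo-labeling concrete, here is a minimal sketch in Python with NumPy. It is not the authors' algorithm, which the abstract does not spell out; it assumes a simple greedy rule in which the most confident predictions are assigned first and each class accepts only as many samples as its known proportion allows. The function name `proportion_constrained_pseudo_labels` and the toy probabilities are hypothetical.

```python
import numpy as np

def proportion_constrained_pseudo_labels(probs, proportions):
    """Greedily assign pseudo-labels so class counts follow `proportions`.

    probs:       (n_samples, n_classes) predicted class probabilities
                 for unlabeled target-domain samples.
    proportions: (n_classes,) known or estimated target class proportions.
    """
    probs = np.asarray(probs, dtype=float)
    n, c = probs.shape
    proportions = np.asarray(proportions, dtype=float)
    proportions = proportions / proportions.sum()

    # Per-class quotas: round down, then give the leftover slots to the
    # classes with the largest fractional parts so the quotas sum to n.
    raw = proportions * n
    quota = np.floor(raw).astype(int)
    leftover = n - quota.sum()
    by_fraction = np.argsort(-(raw - quota), kind="stable")
    quota[by_fraction[:leftover]] += 1

    labels = np.full(n, -1, dtype=int)
    # Visit every (sample, class) pair from most to least confident and
    # accept it if the sample is still unlabeled and the class has room.
    for flat in np.argsort(-probs, axis=None, kind="stable"):
        i, k = divmod(int(flat), c)
        if labels[i] == -1 and quota[k] > 0:
            labels[i] = k
            quota[k] -= 1
    return labels

# Toy case: a source-biased model favors class 0 for every target sample,
# but the target domain is known to be split evenly between two classes.
probs = np.array([[0.90, 0.10],
                  [0.80, 0.20],
                  [0.60, 0.40],
                  [0.55, 0.45]])
labels = proportion_constrained_pseudo_labels(probs, [0.5, 0.5])
print(labels.tolist())  # [0, 0, 1, 1]; plain argmax would give [0, 0, 0, 0]
```

In this toy case the proportion constraint moves the two least confident class-0 predictions to class 1, which is the kind of class-distribution correction the abstract describes. A practical system would feed such pseudo-labels back into training and might replace the greedy rule with a more principled assignment.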