🤖 AI Summary
This work addresses the lack of selectivity in existing test-time adaptation methods for medical image segmentation, which often lets the predicted tumor mask spill past the true boundary or corrupts predictions that were already correct, thereby compromising clinical safety. To mitigate this, the authors propose a hypothesis-driven adaptation framework that generates two competing geometric hypotheses, compaction (shrinking the mask) and inflation (expanding it), and combines them with a representation-guided selector and a pre-screening gate that skips adaptation on confident cases, turning adaptation into a dynamic decision process. Evaluated on cross-domain brain tumor MRI segmentation, the method maintains comparable Dice scores while improving safety-oriented metrics: the Hausdorff Distance (HD95) is reduced by approximately 6.4 mm and Precision increases by over 4% relative to representative baselines. The gain is therefore in boundary reliability and clinical safety rather than in overlap accuracy.
📝 Abstract
Standard Test-Time Adaptation (TTA) methods typically treat inference as a blind optimization task, applying generic objectives to all or filtered test samples. In safety-critical medical segmentation, this lack of selectivity often causes the tumor mask to spill into healthy brain tissue or degrades predictions that were already correct. We propose Hypothesis-Driven TTA (HD-TTA), a novel framework that reformulates adaptation as a dynamic decision process. Rather than forcing a single optimization trajectory, our method generates two intuitive, competing geometric hypotheses: compaction (is the prediction noisy? trim artifacts) versus inflation (is the valid tumor under-segmented? safely inflate to recover). It then employs a representation-guided selector to autonomously identify the safest outcome based on intrinsic texture consistency. Additionally, a pre-screening Gatekeeper prevents negative transfer by skipping adaptation on confident cases. We validate this proof of concept on a cross-domain binary brain tumor segmentation task, applying a source model trained on adult BraTS gliomas to unseen pediatric and more challenging meningioma target domains. HD-TTA improves safety-oriented outcomes, namely the 95th-percentile Hausdorff Distance (HD95) and Precision, over several representative state-of-the-art baselines in the challenging safety regime, reducing HD95 by approximately 6.4 mm and improving Precision by over 4%, while maintaining comparable Dice scores. These results demonstrate that resolving the safety-adaptation trade-off via explicit hypothesis selection is a viable, robust path for safe clinical model deployment. Code will be made publicly available upon acceptance.
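The decision process described in the abstract (gate, two competing hypotheses, selector) can be illustrated with a small toy sketch. This is not the authors' implementation, which has not been released. Everything specific here is an assumption made for illustration: morphological erosion and dilation stand in for the compaction and inflation hypotheses, mean per-pixel confidence stands in for the Gatekeeper criterion, intensity variance inside the mask stands in for the representation-guided texture-consistency score, and the function names and the `gate_conf` and `threshold` parameters are hypothetical.

```python
import numpy as np


def _neighbour_stack(mask):
    """Stack a 2D boolean mask with its four axis-aligned neighbours (zero padded)."""
    p = np.pad(mask, 1, constant_values=False)
    return np.stack([
        p[1:-1, 1:-1],  # the pixel itself
        p[:-2, 1:-1],   # neighbour above
        p[2:, 1:-1],    # neighbour below
        p[1:-1, :-2],   # neighbour to the left
        p[1:-1, 2:],    # neighbour to the right
    ])


def compact(mask):
    """Assumed 'compaction' hypothesis: one step of binary erosion."""
    return _neighbour_stack(mask).all(axis=0)


def inflate(mask):
    """Assumed 'inflation' hypothesis: one step of binary dilation."""
    return _neighbour_stack(mask).any(axis=0)


def texture_inconsistency(image, mask):
    """Hypothetical selector score: intensity variance inside the mask.

    Lower means more homogeneous texture; an empty mask is never selected.
    """
    if not mask.any():
        return float("inf")
    return float(image[mask].var())


def hypothesis_driven_step(image, prob, gate_conf=0.9, threshold=0.5):
    """Return (mask, decision) for one 2D slice.

    image: 2D float array of intensities.
    prob:  2D float array of foreground probabilities from the source model.
    """
    mask = prob >= threshold

    # Gate (assumed criterion): skip adaptation when the model is confident.
    confidence = float(np.maximum(prob, 1.0 - prob).mean())
    if confidence >= gate_conf:
        return mask, "skip"

    # Generate both competing hypotheses, then keep the more consistent one.
    candidates = {"compaction": compact(mask), "inflation": inflate(mask)}
    choice = min(candidates, key=lambda k: texture_inconsistency(image, candidates[k]))
    return candidates[choice], choice


if __name__ == "__main__":
    image = np.zeros((8, 8))
    image[3:6, 3:6] = 1.0           # synthetic homogeneous "tumor"
    prob = np.full((8, 8), 0.4)
    prob[2:7, 2:7] = 0.6            # uncertain, over-segmented prediction
    refined, decision = hypothesis_driven_step(image, prob)
    print(decision, int(refined.sum()))
```

In this toy setting an over-segmented, low-confidence prediction is trimmed back by the compaction branch, an under-segmented one is grown by the inflation branch, and a confident prediction is returned unchanged. A real system would presumably replace the variance score with features from the segmentation network and the morphological steps with actual adaptation updates.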