🤖 AI Summary
AI models for chest X-ray analysis often suffer from shortcut learning: they rely on spurious, off-target features rather than clinically relevant ones, which limits their specificity. To address this, we propose RoentMod, a framework for anatomy-preserving, controllable pathology editing. RoentMod combines the open-source generator RoentGen with an image-to-image modification model, without requiring retraining, to synthesize counterfactual chest X-rays containing user-specified pathologies. These counterfactual images expose shortcut behavior in both multi-task and foundation models, and using them for counterfactual data augmentation during training mitigates it, improving model robustness and interpretability. In internal validation, AUC improved by 3–19%; in external testing, AUC improved by 1–11% for 5 of 6 pathologies. In reader studies, radiologists judged the edited images realistic in 93% of cases and found the specified finding correctly incorporated in 89–99% of cases.
📝 Abstract
Chest radiographs (CXRs) are among the most common tests in medicine. Automated image interpretation may reduce radiologists' workload and expand access to diagnostic expertise. Deep learning multi-task and foundation models have shown strong performance for CXR interpretation but are vulnerable to shortcut learning, where models rely on spurious and off-target correlations rather than clinically relevant features to make decisions. We introduce RoentMod, a counterfactual image editing framework that generates anatomically realistic CXRs with user-specified, synthetic pathology while preserving unrelated anatomical features of the original scan. RoentMod combines an open-source medical image generator (RoentGen) with an image-to-image modification model without requiring retraining. In reader studies with board-certified radiologists and radiology residents, RoentMod-produced images appeared realistic in 93% of cases, correctly incorporated the specified finding in 89-99% of cases, and preserved native anatomy comparable to real follow-up CXRs. Using RoentMod, we demonstrate that state-of-the-art multi-task and foundation models frequently exploit off-target pathology as shortcuts, limiting their specificity. Incorporating RoentMod-generated counterfactual images during training mitigated this vulnerability, improving model discrimination across multiple pathologies by 3-19% AUC in internal validation and by 1-11% for 5 out of 6 tested pathologies in external testing. These findings establish RoentMod as a broadly applicable tool for probing and correcting shortcut learning in medical AI. By enabling controlled counterfactual interventions, RoentMod enhances the robustness and interpretability of CXR interpretation models and provides a generalizable strategy for improving foundation models in medical imaging.
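The shortcut probe described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the example labels, and all scores are hypothetical, and the counterfactual editing step itself is assumed to have happened elsewhere. The sketch only shows how one might quantify a shortcut once a classifier has scored both the original images and their counterfactual versions with an off-target finding edited in.

```python
from statistics import mean


def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def shortcut_shift(scores_original, scores_counterfactual):
    """Mean change in the target-label score after an off-target finding
    is edited into the same images. A large positive value suggests the
    classifier uses the off-target finding as a shortcut."""
    return mean(c - o for o, c in zip(scores_original, scores_counterfactual))


# Hypothetical classifier scores for one target label (say, pneumonia).
# Images WITHOUT the target finding, before editing:
neg_original = [0.10, 0.20, 0.15, 0.30]
# The same images after an off-target finding (say, cardiomegaly) was added:
neg_counterfactual = [0.40, 0.55, 0.35, 0.60]
# Images WITH the target finding, unedited:
pos_original = [0.70, 0.50, 0.90, 0.45]

shift = shortcut_shift(neg_original, neg_counterfactual)
auc_before = auc(pos_original, neg_original)
auc_after = auc(pos_original, neg_counterfactual)

print(f"mean score shift from off-target edit: {shift:.4f}")
print(f"AUC against original negatives:        {auc_before:.2f}")
print(f"AUC against counterfactual negatives:  {auc_after:.2f}")
```

In this toy data the target score rises on images that still lack the target finding, and discrimination drops once the negatives carry the off-target finding. That is the pattern the abstract describes as limited specificity; counterfactual augmentation would add such edited images, with their unchanged target labels, to the training set.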