🤖 AI Summary
To address anatomical structure distortion in unpaired cross-modality medical image translation, this paper proposes ContourDiff with Spatially Coherent Guided Diffusion (SCGD). SCGD enforces domain-invariant anatomical contours as constraints at every denoising step, preserving the original anatomical topology and spatial consistency during CT-to-MRI translation. Because training requires no access to input-domain data, the model also transfers zero-shot: a model trained on one anatomical region can be applied directly to unseen regions without retraining. Evaluated via segmentation transfer and foreground-focused FID/KID, SCGD substantially outperforms state-of-the-art unpaired translation methods on lumbar spine and hip-and-thigh CT-to-MRI tasks: segmentation models trained on translated images yield a >8% Dice score gain on real MRI, while foreground FID and KID decrease by 32% and 41%, respectively.
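The contour-as-constraint sampling idea can be sketched in a few lines. This is a toy illustration only: `extract_contour` is a crude threshold-plus-gradient stand-in for the paper's contour representation, and `denoise_step` is a hypothetical denoiser passed in by the caller, not the actual SCGD network.

```python
import numpy as np

def extract_contour(img, thresh=0.5):
    """Crude domain-agnostic contour: binarize the image, then mark
    boundary pixels where the mask's gradient is nonzero.
    (Stand-in for the paper's contour extractor.)"""
    mask = (img > thresh).astype(float)
    gy, gx = np.gradient(mask)
    return (np.hypot(gx, gy) > 0).astype(float)

def sample_with_contour(denoise_step, contour, steps=50, shape=(64, 64), seed=0):
    """Reverse-diffusion loop in which the contour conditions EVERY step,
    so anatomical structure is enforced throughout sampling.
    `denoise_step(x, t, contour)` is a hypothetical denoiser."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)          # start from pure noise
    for t in reversed(range(steps)):
        x = denoise_step(x, t, contour)     # contour injected at each step
    return x
```

The key point the sketch captures is that the contour is not a one-time input but a constraint re-applied at every sampling step, which is how the method keeps the output spatially consistent with the source anatomy.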
📝 Abstract
Accurately translating medical images between different modalities, such as Computed Tomography (CT) to Magnetic Resonance Imaging (MRI), has numerous downstream clinical and machine learning applications. While several methods have been proposed to achieve this, they often prioritize perceptual quality with respect to output domain features over preserving anatomical fidelity. However, maintaining anatomy during translation is essential for many tasks, e.g., when leveraging masks from the input domain to develop a segmentation model with images translated to the output domain. To address these challenges, we propose ContourDiff with Spatially Coherent Guided Diffusion (SCGD), a novel framework that leverages domain-invariant anatomical contour representations of images. These representations are simple to extract from images, yet form precise spatial constraints on their anatomical content. We introduce a diffusion model that converts contour representations of images from arbitrary input domains into images in the output domain of interest. By applying the contour as a constraint at every diffusion sampling step, we ensure the preservation of anatomical content. We evaluate our method on challenging lumbar spine and hip-and-thigh CT-to-MRI translation tasks, via (1) the performance of segmentation models trained on translated images applied to real MRIs, and (2) the foreground FID and KID of translated images with respect to real MRIs. Our method outperforms other unpaired image translation methods by a significant margin across almost all metrics and scenarios. Moreover, it achieves this without the need to access any input domain information during training, and we further verify its zero-shot capability, showing that a model trained on one anatomical region can be directly applied to unseen regions without retraining (GitHub: https://github.com/mazurowski-lab/ContourDiff).
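The foreground FID/KID evaluation restricts the distribution comparison to anatomical (foreground) regions so that background differences do not dominate the score. A minimal sketch of the idea follows; note the assumptions: per-image foreground mean/std statistics stand in for the Inception features used in real FID, and the square-root-of-covariance-product trace is computed via eigenvalues (valid because covariance matrices of both sets are positive semi-definite).

```python
import numpy as np

def foreground_features(images, masks):
    """Stand-in for Inception features: per-image mean and std over
    foreground pixels only, so background content cannot affect the score."""
    feats = []
    for img, m in zip(images, masks):
        fg = img[m > 0]                      # keep only foreground pixels
        feats.append([fg.mean(), fg.std()])
    return np.asarray(feats)

def fid(f1, f2):
    """Frechet distance between Gaussian fits of two feature sets:
    ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})."""
    mu1, mu2 = f1.mean(0), f2.mean(0)
    s1 = np.cov(f1, rowvar=False)
    s2 = np.cov(f2, rowvar=False)
    # Tr((S1 S2)^{1/2}) = sum of sqrt of eigenvalues of S1 @ S2,
    # which are real and nonnegative for PSD S1, S2.
    eigvals = np.linalg.eigvals(s1 @ s2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    return float(((mu1 - mu2) ** 2).sum()
                 + np.trace(s1) + np.trace(s2) - 2 * tr_covmean)
```

In the paper's setting, `f1` would come from translated images and `f2` from real MRIs, both masked to foreground anatomy; identical feature distributions yield a score near zero, and any shift between them increases it.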