🤖 AI Summary
Motion artifacts in 2D brain MRI frequently compromise diagnostic accuracy, necessitating robust correction methods. Method: This study systematically evaluates the clinical robustness of denoising diffusion probabilistic models (DDPMs) for motion artifact correction on a BraTS-derived benchmark, comparing them against supervised U-Nets across axial, coronal, and sagittal planes and under heterogeneous multi-source data conditions. Contribution/Results: We uncover a critical duality in DDPM behavior: while achieving higher reconstruction fidelity than U-Nets in certain settings, DDPMs consistently generate diagnostically harmful hallucinations, particularly under strong data distribution shift or for non-axial acquisitions. Key findings identify the imaging plane and data distribution shift as the primary determinants of model instability. Overall, current DDPMs exhibit inferior clinical robustness compared to supervised methods, underscoring significant deployment risks. Our work provides both a rigorous evaluation framework and risk-aware guidance for the clinical translation of generative models in medical imaging.
📝 Abstract
Magnetic Resonance Imaging (MRI) generally requires long acquisition times and is sensitive to patient motion, which produces artifacts in the acquired images that may hinder their diagnostic value. Despite research efforts to shorten acquisitions and to design more efficient acquisition sequences, motion artifacts remain a persistent problem, motivating the development of automatic motion artifact correction techniques. Recently, diffusion models have been proposed as a solution to this task. While diffusion models can produce high-quality reconstructions, they are also susceptible to hallucination, which poses risks in diagnostic applications. In this study, we critically evaluate the use of diffusion models for correcting motion artifacts in 2D brain MRI scans. Using a popular benchmark dataset, we compare a diffusion model-based approach against state-of-the-art U-Nets trained in a supervised fashion to reconstruct ground-truth motion-free images from motion-affected inputs. Our findings reveal mixed results: diffusion models can produce accurate predictions or generate harmful hallucinations in this context, depending on data heterogeneity and the acquisition plane used as input.
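Supervised correction models of the kind described above need paired motion-affected and motion-free images, and such pairs are typically obtained by synthetically corrupting clean scans in k-space: translating the object between the acquisition of individual phase-encode lines is equivalent, by the Fourier shift theorem, to multiplying those lines by a linear phase ramp. The sketch below illustrates this standard simulation idea in NumPy; the function name, parameters, and corruption schedule are illustrative assumptions, not the pipeline used in this paper.

```python
import numpy as np

def simulate_motion_artifact(image, n_corrupted=8, max_shift=4.0, seed=0):
    """Corrupt random k-space lines of a 2D image with phase ramps that
    mimic in-plane translations occurring during acquisition.

    NOTE: a generic illustration of k-space motion simulation, not the
    benchmark's actual corruption model.
    """
    rng = np.random.default_rng(seed)
    k = np.fft.fftshift(np.fft.fft2(image))        # centered 2D k-space
    n_rows, n_cols = image.shape
    rows = rng.choice(n_rows, size=n_corrupted, replace=False)
    fx = np.fft.fftshift(np.fft.fftfreq(n_cols))   # frequencies along readout
    for r in rows:
        dx = rng.uniform(-max_shift, max_shift)    # random shift in pixels
        # Fourier shift theorem: a spatial shift is a linear phase in k-space
        k[r, :] *= np.exp(-2j * np.pi * fx * dx)
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k)))
```

The corrupted magnitude image exhibits the characteristic ghosting and ringing along the phase-encode direction; training pairs are then (corrupted, original) slices.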