Perceptual Evaluation of GANs and Diffusion Models for Generating X-rays

📅 2025-08-09

🤖 AI Summary

Problem: We evaluate the perceptual realism and clinical utility of two families of generative models, GANs and diffusion models (DMs), for synthesizing chest X-rays conditioned on four thoracic abnormalities (atelectasis, lung opacity, pleural effusion, enlarged cardiac silhouette). Method: We build a benchmark of real images from MIMIC-CXR and synthetic images from both model families, then run a reader study in which three radiologists of varied experience judge whether each image is real or synthetic and whether its visual features are consistent with the target abnormality. Contribution/Results: The two model families show complementary strengths: DMs generate more visually realistic images overall, whereas GANs can achieve better accuracy for specific conditions, such as the absence of an enlarged cardiac silhouette. We also identify the visual cues radiologists use to detect synthetic images, which exposes the perceptual gaps in current models and points to the refinement still needed before synthetic images can reliably augment training datasets for AI diagnostic systems.

📝 Abstract

Generative image models have achieved remarkable progress in both natural and medical imaging. In the medical context, these techniques offer a potential solution to data scarcity, especially for low-prevalence anomalies, which impairs the performance of AI-driven diagnostic and segmentation tools. However, questions remain regarding the fidelity and clinical utility of synthetic images, since poor generation quality can undermine model generalizability and trust. In this study, we evaluate the effectiveness of two state-of-the-art families of generative models, Generative Adversarial Networks (GANs) and Diffusion Models (DMs), for synthesizing chest X-rays conditioned on four abnormalities: Atelectasis (AT), Lung Opacity (LO), Pleural Effusion (PE), and Enlarged Cardiac Silhouette (ECS). Using a benchmark composed of real images from the MIMIC-CXR dataset and synthetic images from both GANs and DMs, we conducted a reader study with three radiologists of varied experience. Participants were asked to distinguish real from synthetic images and to assess the consistency between visual features and the target abnormality. Our results show that while DMs generate more visually realistic images overall, GANs can achieve better accuracy for specific conditions, such as the absence of ECS. We further identify the visual cues radiologists use to detect synthetic images, offering insights into the perceptual gaps in current models. These findings underscore the complementary strengths of GANs and DMs and point to the need for further refinement to ensure generative models can reliably augment training datasets for AI diagnostic systems.
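To make the real-versus-synthetic part of the reader study concrete, here is a minimal Python sketch of how reader responses could be tallied per image source. It is an illustration under stated assumptions, not the authors' analysis: the function name `detection_rates`, the source labels `"real"`, `"gan"`, and `"dm"`, and the response values are all hypothetical placeholders, and the toy numbers are not results from the paper.

```python
from collections import defaultdict


def detection_rates(responses):
    """Fraction of images from each source that a reader judged synthetic.

    `responses` is an iterable of (source, judged_synthetic) pairs, where
    `source` is a hypothetical label such as "real", "gan", or "dm" and
    `judged_synthetic` is the reader's boolean call for that image.
    """
    counts = defaultdict(lambda: [0, 0])  # source -> [judged synthetic, total]
    for source, judged_synthetic in responses:
        counts[source][0] += int(judged_synthetic)
        counts[source][1] += 1
    return {source: flagged / total for source, (flagged, total) in counts.items()}


# Invented placeholder responses for one reader; NOT data from the study.
toy_responses = [
    ("real", False), ("real", False), ("real", True), ("real", False),
    ("gan", True), ("gan", True), ("gan", True), ("gan", False),
    ("dm", True), ("dm", False), ("dm", False), ("dm", True),
]

rates = detection_rates(toy_responses)
for source, rate in rates.items():
    print(f"{source}: judged synthetic in {rate:.0%} of cases")
```

Under this framing, a lower rate for a synthetic source means its images were harder to tell apart from real ones, while the rate for `"real"` is the reader's false-alarm rate. The same tally could be repeated per abnormality to study condition-specific behavior.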

Problem

Research questions and friction points this paper is trying to address.

Evaluating the quality of chest X-rays generated by GANs and DMs

Assessing clinical utility of synthetic images for AI diagnostics

Identifying perceptual gaps in generative models for medical imaging

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates GANs and Diffusion Models for X-rays

Compares synthetic and real images through a radiologist reader study

Identifies visual cues for synthetic image detection
