SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model

📅 2025-07-07

📈 Citations: 0

✨ Influential: 0

career value

193K/year

🤖 AI Summary

Addressing the challenge of generating high-fidelity, large-angle controllable multi-view images from a single-view X-ray, this paper proposes a weak-to-strong progressive synthesis framework based on the Diffusion Transformer. The method introduces a view-conditioning mechanism and a staged high-resolution diffusion training strategy to significantly enhance anatomical detail fidelity and precise view control. Compared with conventional GAN- or VAE-based approaches, our method overcomes key limitations—including narrow angular range, low resolution, and severe artifacts—enabling continuous ±60° multi-view generation on chest radiographs and orthopedic X-rays. Quantitatively, it achieves over 12% improvement in PSNR and SSIM, while supporting clinically interpretable, fine-grained structural reconstruction. Extensive experiments validate its practical utility in low-dose diagnostic assistance, medical education visualization, and few-shot data augmentation.

Technology Category

Application Category

📝 Abstract

X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis. Our code is available at GitHub.

Problem

Research questions and friction points this paper is trying to address.

Synthesize multi-view X-ray images from single view

Overcome angular range and resolution limitations

Reduce radiation exposure and clinical workflow complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses view-conditioned diffusion model

Leverages Diffusion Transformer for details

Employs weak-to-strong training strategy

🔎 Similar Papers

ViewFusion: Learning Composable Diffusion Models for Novel View Synthesis