🤖 AI Summary
This work proposes a blind self-supervised enhancement framework for ultrasound images, which are commonly degraded by speckle noise, point spread function (PSF) blur, and device-specific artifacts. Unlike conventional supervised methods that rely on hard-to-obtain clean labels or precise degradation models, the proposed approach integrates physics-guided degradation modeling with self-supervised learning. It constructs degraded inputs using Gaussian PSF blurring, spatial additive Gaussian noise, and complex Fourier-domain perturbations, while generating pseudo-labels via non-local low-rank denoising, all without ground-truth references. A Swin Convolutional U-Net is then trained on single-frame images to learn reconstruction. The method achieves state-of-the-art PSNR and SSIM across multiple ultrasound datasets, outperforming models such as MSANN and Restormer by 1–5 dB in PSNR and 0.05–0.20 in SSIM under severe noise. It also significantly improves Dice scores for fetal head circumference and pubic symphysis segmentation, demonstrating its effectiveness as a plug-and-play preprocessor for downstream tasks.
📝 Abstract
Ultrasound (US) interpretation is hampered by multiplicative speckle, acquisition blur from the point-spread function (PSF), and scanner- and operator-dependent artifacts. Supervised enhancement methods assume access to clean targets or known degradations, assumptions rarely met in practice. We present a blind, self-supervised enhancement framework that jointly deconvolves and denoises B-mode images using a Swin Convolutional U-Net trained with a \emph{physics-guided} degradation model. From each training frame, we extract rotated/cropped patches and synthesize inputs by (i) convolving with a Gaussian PSF surrogate and (ii) injecting noise via either spatial additive Gaussian noise or complex Fourier-domain perturbations that emulate phase/magnitude distortions. For US scans, clean-like targets are obtained via non-local low-rank (NLLR) denoising, removing the need for ground truth; for natural images, the originals serve as targets. Trained and validated on UDIAT~B, JNU-IFM, and XPIE Set-P, and evaluated additionally on a 700-image PSFHS test set, the method achieves the highest PSNR/SSIM across Gaussian and speckle noise levels, with margins that widen under stronger corruption. Relative to MSANN, Restormer, and DnCNN, it typically preserves an extra $\sim$1--4\,dB PSNR and 0.05--0.15 SSIM in heavy Gaussian noise, and $\sim$2--5\,dB PSNR and 0.05--0.20 SSIM under severe speckle. Controlled PSF studies show reduced FWHM and higher peak gradients, evidence of resolution recovery without edge erosion. Used as a plug-and-play preprocessor, it consistently boosts Dice for fetal head and pubic symphysis segmentation. Overall, the approach offers a practical, assumption-light path to robust US enhancement that generalizes across datasets, scanners, and degradation types.
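The abstract's degradation pipeline (Gaussian PSF blur followed by either spatial additive Gaussian noise or a complex Fourier-domain perturbation) can be sketched as below. This is a minimal NumPy illustration, not the paper's implementation: all parameter values (`sigma_psf`, `noise_sigma`, `fourier_eps`) and the exact form of the Fourier perturbation are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_psf_blur(img, sigma):
    """Blur with a Gaussian PSF surrogate via frequency-domain multiplication.

    The transfer function of a Gaussian kernel with std `sigma` (pixels) is
    exp(-2 * pi^2 * sigma^2 * f^2)."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    H = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * H))

def degrade(img, sigma_psf=1.5, noise_sigma=0.05, fourier_eps=0.02, mode="spatial"):
    """Synthesize a degraded input: PSF blur, then noise injection.

    mode="spatial": additive Gaussian noise in the image domain.
    mode="fourier": small complex perturbation of the spectrum, emulating
    magnitude/phase distortions (illustrative form, assumed here)."""
    blurred = gaussian_psf_blur(img.astype(np.float64), sigma_psf)
    if mode == "spatial":
        return blurred + rng.normal(0.0, noise_sigma, blurred.shape)
    spec = np.fft.fft2(blurred)
    noise = rng.normal(size=spec.shape) + 1j * rng.normal(size=spec.shape)
    spec = spec + fourier_eps * noise * np.abs(spec)
    return np.real(np.fft.ifft2(spec))

# Example: degrade a synthetic 64x64 patch both ways.
patch = rng.random((64, 64))
out_spatial = degrade(patch, mode="spatial")
out_fourier = degrade(patch, mode="fourier")
```

Pairing such degraded inputs with NLLR-denoised pseudo-labels (for US) or the original frames (for natural images) yields the training pairs described above without any clean ground truth.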