🤖 AI Summary
Low-dose CT (LDCT) images suffer from increased noise and blurred anatomical detail due to radiation dose reduction, and existing enhancement methods often over-smooth, compromising lesion visibility and structural fidelity. To address this, the authors propose D-PerceptCT, a vision-perception-driven LDCT enhancement framework. It introduces the Visual Dual-path Extractor (ViDex), which fuses semantic priors from a pretrained DINOv2 model with local spatial features, and a Global-Local State-Space block that captures long-range dependencies and multiscale detail to preserve diagnostically important structures. A Deep Perceptual Relevancy Loss Function (DPRLF), inspired by human contrast sensitivity, further emphasizes perceptually relevant features during reconstruction. Evaluated on the Mayo2016 dataset, the method achieves state-of-the-art performance, improving lesion conspicuity and clinical interpretability while preserving structural accuracy, yielding perceptually superior LDCT images that support reliable low-dose diagnostic decision-making.
📝 Abstract
Low-Dose Computed Tomography (LDCT) is widely used as an imaging solution to aid diagnosis and other clinical tasks. However, this comes at the price of a deterioration in image quality due to the low radiation dose used to reduce the risk of secondary cancer development. While some effective methods have been proposed to enhance LDCT quality, many overestimate noise and perform excessive smoothing, leading to a loss of critical details. In this paper, we introduce D-PerceptCT, a novel architecture inspired by key principles of the Human Visual System (HVS) to enhance LDCT images. The objective is to guide the model to enhance or preserve perceptually relevant features, thereby providing radiologists with CT images where critical anatomical structures and fine pathological details are perceptually visible. D-PerceptCT consists of two main blocks: (1) a Visual Dual-path Extractor (ViDex), which integrates semantic priors from a pretrained DINOv2 model with local spatial features, allowing the network to incorporate semantic awareness during enhancement; (2) a Global-Local State-Space block that captures long-range information and multiscale features to preserve the structures and fine details important for diagnosis. In addition, we propose a novel deep perceptual loss, designated the Deep Perceptual Relevancy Loss Function (DPRLF), which is inspired by human contrast sensitivity, to further emphasize perceptually important features. Extensive experiments on the Mayo2016 dataset demonstrate the effectiveness of the D-PerceptCT method for LDCT enhancement, showing better preservation of structural and textural information within LDCT images compared to SOTA methods.
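To make the contrast-sensitivity idea behind a loss like DPRLF concrete, the sketch below weights per-frequency reconstruction error by a contrast sensitivity function (CSF), so that spatial-frequency bands the HVS perceives most strongly dominate the loss. This is a hedged, minimal 1-D illustration, not the paper's actual formulation: the `csf_weight` curve (a Mannos–Sakrison-style approximation), the naive DFT, and the function names are all assumptions introduced here for illustration.

```python
import math

def csf_weight(f):
    # Mannos-Sakrison-style contrast sensitivity approximation
    # (hypothetical choice; the paper's DPRLF may use a different model).
    # Peaks at mid spatial frequencies, falls off at low and high ones.
    return 2.6 * (0.0192 + 0.114 * f) * math.exp(-((0.114 * f) ** 1.1))

def dft_mag(signal):
    # Naive DFT magnitudes for a short 1-D signal (stand-in for a 2-D
    # frequency decomposition of a CT slice).
    n = len(signal)
    mags = []
    for k in range(n):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    return mags

def perceptual_loss(pred, target):
    # Squared error between frequency magnitudes, weighted per band by the
    # CSF: perceptually salient bands contribute more to the loss, which
    # discourages the over-smoothing that plain MSE tends to produce.
    mp, mt = dft_mag(pred), dft_mag(target)
    n = len(pred)
    return sum(csf_weight(k) * (a - b) ** 2
               for k, (a, b) in enumerate(zip(mp, mt))) / n
```

In a real 2-D training setup the same weighting would be applied over radial spatial frequencies of image patches; here the key point is only that the error is re-weighted by perceptual sensitivity rather than treated uniformly.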