Synthetic Data-Driven Multi-Architecture Framework for Automated Polyp Segmentation Through Integrated Detection and Mask Generation

📅 2025-08-08

🤖 AI Summary

To address data scarcity and high annotation costs in polyp segmentation for colonoscopy images, this paper proposes a framework that combines synthetic data generation with several cooperating models. Stable Diffusion is used to synthesize polyp images and ease the small-sample bottleneck. A cascaded detection-segmentation pipeline pairs Faster R-CNN, which localizes polyps with high recall (93.08%), with the Segment Anything Model (SAM), which generates the masks. A benchmark of five segmentation architectures (U-Net, PSPNet, FPN, LinkNet, and MANet) shows that LinkNet gives the most balanced segmentation scores (IoU: 64.20%; Dice: 77.53%), while FPN scores highest on the image-quality metrics PSNR and SSIM. The framework is aimed at improving generalization and segmentation robustness when training data are limited.
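The cascaded design described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `cascade`, `toy_detect`, `toy_segment`, and the `score_threshold` value are hypothetical names and settings chosen for the example. In a real setup the `detect` callable might wrap a Faster R-CNN model and the `segment` callable might wrap a box-prompted SAM predictor; here both are replaced by stand-ins so the sketch runs without model weights.

```python
import numpy as np

def cascade(image, detect, segment, score_threshold=0.5):
    """Run a detector, keep confident boxes, and segment inside each box.

    detect(image)       -> list of (box, score), box = (x0, y0, x1, y1) in pixels
    segment(image, box) -> boolean mask with the image's height and width
    Both callables and the threshold are hypothetical placeholders.
    """
    height, width = image.shape[:2]
    combined = np.zeros((height, width), dtype=bool)
    kept = []
    for box, score in detect(image):
        if score < score_threshold:
            continue  # a lower threshold trades precision for recall
        kept.append(box)
        combined |= segment(image, box)  # union of all per-box masks
    return kept, combined

# Stand-ins so the sketch runs without any trained model.
def toy_detect(image):
    return [((2, 2, 6, 6), 0.9), ((0, 0, 3, 3), 0.2)]

def toy_segment(image, box):
    x0, y0, x1, y1 = box
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = True  # pretend the polyp fills the whole box
    return mask

image = np.zeros((8, 8, 3), dtype=np.uint8)
boxes, mask = cascade(image, toy_detect, toy_segment)
```

Keeping the detector and the segmenter behind plain callables makes it straightforward to swap either stage, which is one plausible reason to favor a high-recall detector: a missed polyp never reaches the segmentation stage at all.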

📝 Abstract

Colonoscopy is a vital tool for the prevention and early diagnosis of colorectal cancer, one of the leading causes of cancer-related mortality worldwide. This research introduces a multi-architecture framework that automates polyp detection in colonoscopy images while addressing limited healthcare dataset sizes and annotation complexity. The system combines Stable Diffusion-based synthetic data generation with detection and segmentation algorithms. Faster R-CNN performs the initial object localization, and the Segment Anything Model (SAM) then generates the segmentation masks. Faster R-CNN achieved a recall of 93.08%, a precision of 88.97%, and an F1 score of 90.98%. Five state-of-the-art segmentation models (U-Net, PSPNet, FPN, LinkNet, and MANet), each using ResNet34 as the backbone, were then evaluated. FPN obtained the highest PSNR (7.205893) and SSIM (0.492381), U-Net the highest recall (84.85%), and LinkNet balanced performance in IoU (64.20%) and Dice score (77.53%).
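The reported detection figures can be cross-checked, since the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the two values given in the abstract; the function name `f1_score` is simply chosen for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for Faster R-CNN in the abstract.
f1 = f1_score(precision=0.8897, recall=0.9308)
print(f"F1 = {f1:.2%}")  # prints "F1 = 90.98%"
```

The recomputed value agrees with the 90.98% stated in the abstract, so the three detection metrics are mutually consistent.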

Problem

Research questions and friction points this paper is trying to address.

Automating polyp detection in colonoscopy images

Addressing limited healthcare dataset sizes

Resolving annotation complexities in medical imaging

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic data generation via Stable Diffusion

Combines Faster R-CNN with SAM segmentation

Evaluates multiple segmentation architectures including FPN
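For readers comparing the segmentation scores, the two overlap metrics used in the evaluation can be computed from binary masks as follows. This is a minimal sketch with toy data, not the paper's evaluation code; the function name `iou_and_dice` and the convention for empty masks are assumptions made for the example.

```python
import numpy as np

def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Convention assumed here: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    return float(intersection / union), float(2 * intersection / total)

# Toy 4x4 example (hypothetical data, not from the paper).
pred = np.zeros((4, 4), dtype=int)
target = np.zeros((4, 4), dtype=int)
pred[0:2, 0:3] = 1     # 6 predicted pixels
target[0:2, 1:4] = 1   # 6 ground-truth pixels, 4 of them shared
iou, dice = iou_and_dice(pred, target)
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU). Scores averaged over many images need not satisfy this identity exactly, which may be why the reported IoU of 64.20% and Dice of 77.53% differ slightly from it; how the paper aggregated its scores is not stated in this summary.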
