Synthetic Data-Driven Multi-Architecture Framework for Automated Polyp Segmentation Through Integrated Detection and Mask Generation

📅 2025-08-08
🤖 AI Summary
To address data scarcity and high annotation costs in polyp segmentation from colonoscopy images, this paper proposes an end-to-end framework that combines synthetic data generation with several cooperating models. Stable Diffusion is used to synthesize polyp images, easing the small-sample bottleneck. A cascaded detection-segmentation pipeline pairs Faster R-CNN, tuned for high-recall polyp localization (recall: 93.08%), with the Segment Anything Model (SAM) for mask generation. Benchmarking five segmentation architectures (U-Net, PSPNet, FPN, LinkNet, and MANet) shows that LinkNet gives the most balanced segmentation metrics (IoU: 64.20%; Dice: 77.53%), while FPN scores highest on the reconstruction-quality metrics PSNR and SSIM. The framework is aimed at improving generalization and segmentation robustness when training data is limited.
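The overlap and reconstruction metrics named in the summary can be illustrated on a toy example. The following is a minimal sketch in plain Python, assuming binary masks stored as nested lists of 0/1; the function names and the tiny masks are hypothetical and are not taken from the paper, whose exact evaluation code is not described here.

```python
import math

def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equal-sized binary masks (nested lists of 0/1)."""
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    pred_area = sum(sum(row) for row in pred)
    target_area = sum(sum(row) for row in target)
    union = pred_area + target_area - inter
    if union == 0:
        # Both masks empty: treated as perfect agreement (a convention assumed here).
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + target_area)

def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-sized images."""
    n = sum(len(row) for row in target)
    mse = sum((p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)) / n
    return math.inf if mse == 0 else 10 * math.log10(max_value ** 2 / mse)

# Toy 3x4 masks (illustrative only).
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 1, 0, 0]]

iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.2f}  Dice={dice:.2f}  PSNR={psnr(pred, target):.2f} dB")
```

For a single pair of masks, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU). Scores averaged over a dataset need not satisfy that identity exactly.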

📝 Abstract
Colonoscopy is a vital tool for the prevention and early detection of colorectal cancer, one of the leading causes of cancer-related mortality worldwide. This research introduces a multi-architecture framework that automates polyp detection in colonoscopy images while addressing limited healthcare dataset sizes and annotation complexity. The system combines synthetic data generation based on Stable Diffusion with detection and segmentation algorithms. Detection uses Faster R-CNN for initial object localization, and the Segment Anything Model (SAM) then generates and refines the segmentation masks. Faster R-CNN achieved a recall of 93.08%, a precision of 88.97%, and an F1 score of 90.98%. Five state-of-the-art segmentation models were evaluated, U-Net, PSPNet, FPN, LinkNet, and MANet, each using ResNet34 as the backbone. FPN achieved the highest PSNR (7.205893) and SSIM (0.492381), U-Net the highest recall (84.85%), and LinkNet balanced performance in IoU (64.20%) and Dice score (77.53%).
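The reported detection figures can be cross-checked against each other, because the F1 score is the harmonic mean of precision and recall. The short sketch below uses only the precision and recall values stated in the abstract; the helper name `f1_score` is an illustrative choice, not something from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall reported in the abstract for Faster R-CNN.
precision, recall = 0.8897, 0.9308
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")
```

The result rounds to 90.98%, which agrees with the F1 score given in the abstract.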
Problem

Research questions and friction points this paper is trying to address.

Automating polyp detection in colonoscopy images
Addressing limited healthcare dataset sizes
Resolving annotation complexities in medical imaging
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic data generation via Stable Diffusion
Combines Faster R-CNN with SAM segmentation
Evaluates multiple segmentation architectures including FPN
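The detection-then-segmentation cascade listed above can be sketched as plain control flow. This is a hedged illustration only: it assumes that each sufficiently confident detector box is passed to the segmenter as a prompt and that the resulting masks are merged by union, which the source does not spell out. `cascade_segment`, `toy_detector`, `toy_box_segmenter`, and the 0.5 score threshold are hypothetical stand-ins, not the Faster R-CNN or SAM APIs.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x0, y0, x1, y1), upper bounds exclusive
Mask = List[List[int]]               # binary mask as rows of 0/1
Detection = Tuple[Box, float]        # box plus confidence score
Image = Sequence[Sequence[float]]

def cascade_segment(
    image: Image,
    detect: Callable[[Image], List[Detection]],
    segment_in_box: Callable[[Image, Box], Mask],
    score_threshold: float = 0.5,    # assumed value, not from the paper
) -> Mask:
    """Run the detector, prompt the segmenter once per kept box, and merge the masks."""
    height, width = len(image), len(image[0])
    merged = [[0] * width for _ in range(height)]
    for box, score in detect(image):
        if score < score_threshold:  # drop low-confidence detections
            continue
        mask = segment_in_box(image, box)
        for y in range(height):
            for x in range(width):
                merged[y][x] |= mask[y][x]
    return merged

# Hypothetical stand-ins so the sketch runs without any model weights.
def toy_detector(image: Image) -> List[Detection]:
    return [((1, 1, 3, 3), 0.9), ((0, 0, 1, 1), 0.2)]

def toy_box_segmenter(image: Image, box: Box) -> Mask:
    x0, y0, x1, y1 = box
    # Mark bright pixels inside the box as foreground.
    return [[1 if (x0 <= x < x1 and y0 <= y < y1 and image[y][x] > 0.5) else 0
             for x in range(len(image[0]))] for y in range(len(image))]

image = [[0.9, 0.1, 0.1, 0.1],
         [0.1, 0.8, 0.7, 0.1],
         [0.1, 0.9, 0.2, 0.1],
         [0.1, 0.1, 0.1, 0.1]]
mask = cascade_segment(image, toy_detector, toy_box_segmenter)
for row in mask:
    print(row)
```

Passing the detector and segmenter in as callables keeps the control flow independent of any particular model library, so real models could be substituted for the toy functions without changing `cascade_segment`.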
Ojonugwa Oluwafemi Ejiga Peter
Department of Advanced Computing, Morgan State University, Baltimore, Maryland, USA
Akingbola Oluwapemiisin
Department of Computer and Electrical Engineering, Morgan State University, Baltimore, Maryland, USA
Amalahu Chetachi
Department of Advanced Computing, Morgan State University, Baltimore, Maryland, USA
Adeniran Opeyemi
Department of Computer and Electrical Engineering, Morgan State University, Baltimore, Maryland, USA
Fahmi Khalifa
Assistant Professor
Medical Image Analysis, Machine Learning/Artificial Intelligence, Pattern Recognition, Image and
Md Mahmudur Rahman
Department of Advanced Computing, Morgan State University, Baltimore, Maryland, USA