🤖 AI Summary
Existing public colonoscopy datasets (e.g., CVC-ClinicDB, Kvasir-SEG) are limited by small sample sizes, curated image selection, or a lack of real-world artifacts. To address these limitations, we introduce the BUET Polyp Dataset (BPD), a benchmark dataset for polyp detection and segmentation collected under routine clinical conditions, with resource-constrained settings in mind. BPD comprises 1,288 polyp-containing images from 164 patients, each with an expert-annotated binary mask, and 1,657 polyp-free images from 31 patients. It captures realistic clinical degradations including motion blur, specular reflection, stool residue, blood, and low-light frames. On this dataset, we establish baselines for two tasks: binary classification (VGG16, ResNet50, InceptionV3) reaches up to 90.8% accuracy with VGG16, and segmentation (U-Net with VGG16, ResNet34, or InceptionV4 backbones) reaches a maximum Dice score of 0.64 with the InceptionV4 backbone. These scores are lower than those obtained on curated datasets, reflecting the difficulty of images with artifacts and variable quality. BPD adds publicly available, clinically representative annotated data as a foundation for developing computer-aided diagnosis systems under low-resource conditions.
📝 Abstract
Background and Objective: Colorectal cancer prevention relies on early detection of polyps during colonoscopy. Existing public datasets, such as CVC-ClinicDB and Kvasir-SEG, provide valuable benchmarks but are limited by small sample sizes, curated image selection, or a lack of real-world artifacts. There remains a need for datasets that capture the complexity of clinical practice, particularly in resource-constrained settings. Methods: We introduce the BUET Polyp Dataset (BPD), a set of colonoscopy images collected with Olympus 170 and Pentax i-Scan series endoscopes under routine clinical conditions. The dataset pairs polyp images with expert-annotated binary masks and reflects diverse challenges such as motion blur, specular highlights, stool artifacts, blood, and low-light frames. Annotations were manually reviewed by clinical experts to ensure quality. To demonstrate baseline performance, we provide benchmark results for classification using VGG16, ResNet50, and InceptionV3, and for segmentation using UNet variants with VGG16, ResNet34, and InceptionV4 backbones. Results: The dataset comprises 1,288 polyp images from 164 patients, each with a corresponding ground-truth mask, and 1,657 polyp-free images from 31 patients. Benchmarking experiments achieved up to 90.8% accuracy for binary classification (VGG16) and a maximum Dice score of 0.64 for segmentation (InceptionV4-UNet). Performance was lower than on curated datasets, reflecting the real-world difficulty of images with artifacts and variable quality.
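For readers unfamiliar with the segmentation metric, the Dice score between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch of that definition in plain Python. It is not the authors' evaluation code: the abstract does not state how the score was smoothed, thresholded, or averaged across images, so the function name `dice_score`, the `eps` smoothing term, and the nested-list mask format are all assumptions made for illustration.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks.

    Masks are nested lists (rows of 0/1 values); this input format and the
    eps smoothing term are illustrative assumptions, not taken from the paper.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    pred_flat = [int(bool(v)) for row in pred for v in row]
    target_flat = [int(bool(v)) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")

    # |P ∩ G|: pixels marked as polyp in both masks.
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    # |P| + |G|: total polyp pixels across the two masks.
    total = sum(pred_flat) + sum(target_flat)

    # eps avoids 0/0 when both masks are empty (the score is then 1.0).
    return (2 * intersection + eps) / (total + eps)


# Toy 2x3 masks: 2 overlapping pixels, 3 polyp pixels in each mask.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
score = dice_score(predicted, ground_truth)  # close to 2*2 / (3+3)
```

Under this definition, a score of 0.64 means that, on average, the overlap between predicted and annotated polyp regions is well short of the perfect agreement a score of 1.0 would indicate.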