Diverse Image Generation with Diffusion Models and Cross Class Label Learning for Polyp Classification

📅 2025-02-08
🤖 AI Summary
Histopathological classification of colorectal polyps is held back by scarce multimodal data, costly fine-grained annotation, and limited use of generative AI. This paper proposes PathoPolyp-Diff, a text-controllable model that generates synthetic colonoscopy images across imaging modalities; the authors describe text-prompt control as previously uninvestigated in this domain. The method introduces cross-class label learning, which lets the model learn features from other classes and so reduces annotation dependency; adds text-prompt control for synthesizing adenomatous and hyperplastic polyps under narrow-band and white-light imaging; and builds on diffusion models with cross-class feature distillation, multimodal conditional encoding, and video-level temporal consistency modeling. On a publicly available dataset, the method improves balanced accuracy by up to 7.91%, and cross-class label learning yields a statistically significant improvement of up to 18.33% in balanced accuracy in video-level analysis (p < 0.01). The source code is publicly available.

📝 Abstract
Pathologic diagnosis is a critical phase in deciding the optimal treatment procedure for dealing with colorectal cancer (CRC). Colonic polyps, precursors to CRC, can pathologically be classified into two major types: adenomatous and hyperplastic. For precise classification and early diagnosis of such polyps, the medical procedure of colonoscopy has been widely adopted paired with various imaging techniques, including narrow band imaging and white light imaging. However, the existing classification techniques mainly rely on a single imaging modality and show limited performance due to data scarcity. Recently, generative artificial intelligence has been gaining prominence in overcoming such issues. Additionally, various generation-controlling mechanisms using text prompts and images have been introduced to obtain visually appealing and desired outcomes. However, such mechanisms require class labels to make the model respond efficiently to the provided control input. In the colonoscopy domain, such controlling mechanisms are rarely explored; specifically, the text prompt is a completely uninvestigated area. Moreover, the unavailability of expensive class-wise labels for diverse sets of images limits such explorations. Therefore, we develop a novel model, PathoPolyp-Diff, that generates text-controlled synthetic images with diverse characteristics in terms of pathology, imaging modalities, and quality. We introduce cross-class label learning to make the model learn features from other classes, reducing the burdensome task of data annotation. The experimental results report an improvement of up to 7.91% in balanced accuracy using a publicly available dataset. Moreover, cross-class label learning achieves a statistically significant improvement of up to 18.33% in balanced accuracy during video-level analysis. The code is available at https://github.com/Vanshali/PathoPolyp-Diff.
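Both reported gains are in balanced accuracy, which is the mean of the per-class recalls and is therefore not inflated by the majority class. The following is a minimal sketch of the metric on made-up labels; the function name and the sample data are illustrative assumptions and do not come from the paper or its repository.

```python
def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Number of samples whose true label is c
        total = sum(1 for t in y_true if t == c)
        # Number of those samples that were predicted as c
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Illustrative, imbalanced toy labels (not data from the paper)
y_true = ["adenomatous"] * 8 + ["hyperplastic"] * 2
# A degenerate classifier that always predicts the majority class
y_pred = ["adenomatous"] * 10

plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
print(plain_acc, bal_acc)
```

On this toy input, plain accuracy is 0.8 while balanced accuracy is 0.5, because the minority class has a recall of zero. In practice, scikit-learn's `balanced_accuracy_score` computes the same quantity.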
Problem

Research questions and friction points this paper is trying to address.

Existing polyp classifiers rely mainly on a single imaging modality and perform poorly because data is scarce
Text-prompt control of image generation is unexplored in the colonoscopy domain
Class-wise labels for diverse image sets are expensive, which limits controllable generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

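PathoPolyp-Diff, a diffusion model that generates synthetic polyp images diverse in pathology, imaging modality, and quality
Cross-class label learning, which lets the model learn features from other classes and reduces annotation effort
Text-prompt control over the pathology and imaging modality of generated images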
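To make the idea of text control concrete, here is a rough sketch of how prompts covering every pathology and imaging-modality combination could be enumerated. The prompt template and all names below are hypothetical; the actual prompt format used by PathoPolyp-Diff is not given on this page.

```python
from itertools import product

# Class and modality names taken from the abstract
PATHOLOGIES = ["adenomatous", "hyperplastic"]
MODALITIES = ["narrow band imaging", "white light imaging"]


def build_prompts(pathologies=PATHOLOGIES, modalities=MODALITIES):
    """Return one text prompt per (pathology, modality) pair.

    The template string is an assumption made for illustration only.
    """
    return [
        f"colonoscopy image, {pathology} polyp, {modality}"
        for pathology, modality in product(pathologies, modalities)
    ]


prompts = build_prompts()
for prompt in prompts:
    print(prompt)
```

Each prompt would condition one generation run, so a two-by-two grid like this one gives four kinds of synthetic image.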
Vanshali Sharma
Northwestern University
Medical Image Analysis, Computer Vision, Deep Learning

Debesh Jha
University of South Dakota
Deep Learning, Biomedical Informatics, Medical Image Computing, Computer Vision, AI for Medicine

M. K. Bhuyan
Department of Electronics & Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, 781039, India

Pradip K. Das
Department of Computer Science & Engineering, Indian Institute of Technology Guwahati, Guwahati, 781039, India

Ulas Bagci
Northwestern University
Artificial Intelligence, Deep Learning, Biomedical Image Analysis, Medical Image Computing