Lightweight Relational Embedding in Task-Interpolated Few-Shot Networks for Enhanced Gastrointestinal Disease Classification

📅 2024-06-25

🏛️ Conference on Algebraic Informatics

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Early colorectal cancer (CRC) screening is hampered by poor endoscopic image quality, scarce annotated samples, and high inter-frame similarity, all of which reduce diagnostic accuracy. To address this, the paper proposes a lightweight few-shot learning framework built on a lightweight CNN backbone with three components: a task interpolation strategy that artificially augments the limited images across varied instrument viewpoints; a lightweight relational embedding module that jointly models salient intra-frame features and inter-frame transitions between consecutive frames; and a bi-level routing attention mechanism that strengthens focus on lesion regions. Evaluated on the Kvasir dataset, the model achieves 90.1% accuracy, 0.845 precision, 0.942 recall, and a 0.891 F1 score, which the authors report as surpassing current state-of-the-art methods.

📝 Abstract

Traditional diagnostic methods such as colonoscopy are invasive yet critical for accurately diagnosing colorectal cancer (CRC). Detecting CRC at an early stage is crucial for increasing patient survival rates. However, colonoscopy depends on obtaining adequate, high-quality endoscopic images. Prolonged invasive procedures are inherently risky for patients, while suboptimal or insufficient images hamper diagnostic accuracy. These images, typically derived from video frames, often exhibit similar patterns and are therefore hard to discriminate. To overcome these challenges, we propose a novel Deep Learning network built on a Few-Shot Learning architecture, which includes a tailored feature extractor, task interpolation, relational embedding, and a bi-level routing attention mechanism. The Few-Shot Learning paradigm enables our model to rapidly adapt to unseen fine-grained endoscopic image patterns, and the task interpolation artificially augments the insufficient images from varied instrument viewpoints. Our relational embedding approach discerns critical intra-image features and captures inter-image transitions between consecutive endoscopic frames, overcoming the limitations of Convolutional Neural Networks (CNNs). The integration of a lightweight attention mechanism ensures a concentrated analysis of pertinent image regions. Training on diverse datasets notably improves the model’s generalizability and robustness in handling endoscopic images. Evaluated on the Kvasir dataset, our model demonstrated superior performance, achieving an accuracy of 90.1%, a precision of 0.845, a recall of 0.942, and an F1 score of 0.891. This surpasses current state-of-the-art methods, presenting a promising solution to the challenges of invasive colonoscopy by optimizing CRC detection through advanced image analysis.
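
The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below does that arithmetic; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the Kvasir dataset.
f1 = f1_score(0.845, 0.942)
print(round(f1, 3))  # prints 0.891
```

The result agrees with the F1 score of 0.891 stated in the abstract.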

Problem

Research questions and friction points this paper is trying to address.

Improving early colorectal cancer detection using endoscopic images

Overcoming limitations of invasive colonoscopy with Few-Shot Learning

Enhancing diagnostic accuracy via lightweight relational embedding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-Shot Learning for rapid adaptation

Task interpolation augments insufficient images

Lightweight attention mechanism improves focus
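
The summary above does not spell out how task interpolation is computed. As a rough illustration only, the sketch below assumes a mixup-style convex combination of the support features of two few-shot tasks, with the mixing weight drawn from a Beta distribution. The names `interpolate_tasks` and `sample_lambda`, the toy feature vectors, and the Beta parameter are all hypothetical and may differ from the paper's actual scheme.

```python
import random


def interpolate_tasks(support_a, support_b, lam):
    """Blend two tasks' support features elementwise: lam * a + (1 - lam) * b.

    Assumption: both tasks hold the same number of samples and every
    sample is a feature vector of the same length.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(support_a) != len(support_b):
        raise ValueError("tasks must hold the same number of samples")
    mixed = []
    for feat_a, feat_b in zip(support_a, support_b):
        if len(feat_a) != len(feat_b):
            raise ValueError("feature vectors must have equal length")
        mixed.append([lam * x + (1.0 - lam) * y for x, y in zip(feat_a, feat_b)])
    return mixed


def sample_lambda(rng, alpha=0.5):
    """Draw a mixing weight from Beta(alpha, alpha), as is common in mixup."""
    return rng.betavariate(alpha, alpha)


# Toy support sets: two samples per task, two features per sample.
task_a = [[1.0, 0.0], [0.0, 1.0]]
task_b = [[0.0, 2.0], [2.0, 0.0]]

rng = random.Random(0)  # seeded for repeatability
lam = sample_lambda(rng)
new_task = interpolate_tasks(task_a, task_b, lam)
```

Each call yields an additional synthetic task lying between the two originals, which is the sense in which interpolation can compensate for scarce images. In a full pipeline the labels of the paired samples would need to be blended with the same weight.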
