🤖 AI Summary
High annotation costs severely hinder semantic segmentation of anatomical structures in laparoscopic cholecystectomy videos. To address this, we propose an active learning–driven dataset construction framework specifically designed for surgical video data. This work is the first to deeply integrate active learning into both frame sampling and annotation workflows for surgical videos, and systematically validates deep feature distance as the optimal uncertainty metric. Our method achieves 99.4% of the full-dataset segmentation performance (mIoU = 0.4349 vs. 0.4374) using only 50% of annotated frames, substantially improving annotation efficiency and model generalizability. Key contributions include: (1) establishing a lightweight, efficient annotation paradigm tailored to surgical videos; (2) rigorously identifying deep feature distance as the most effective uncertainty estimator in active learning for this domain; and (3) providing a reproducible, generalizable technical pathway for low-resource medical image segmentation.
📝 Abstract
Labeling has always been expensive in the medical context, which has hindered related deep learning applications. Our work introduces active learning into surgical video frame selection to construct a high-quality, affordable laparoscopic cholecystectomy dataset for semantic segmentation. Active learning folds dataset construction into the Deep Neural Network (DNN) learning pipeline: DNNs trained on the existing dataset identify the most informative samples among newly collected data. At the same time, the DNNs' performance and generalization ability improve over time as the newly selected and annotated data are added to the training set. We assessed different measures of data informativeness and found that deep feature distance selects the most informative data in this task. Our experiments show that with half of the data selected by active learning, the DNNs achieve nearly the same performance (0.4349 mean Intersection over Union, mIoU) as the same DNNs trained on the full dataset (0.4374 mIoU) on the critical anatomies and surgical instruments.
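The abstract does not spell out how deep feature distance drives frame selection. A minimal sketch of one common realization, assuming a greedy farthest-point (core-set style) strategy: each unlabeled frame is scored by the distance of its deep feature vector to the nearest already-labeled frame, and the farthest frames are annotated first. The function name and the use of plain Euclidean distance are illustrative assumptions, not the paper's exact implementation.

```python
import math

def select_informative_frames(unlabeled_feats, labeled_feats, budget):
    """Greedy farthest-point selection over deep feature distances.

    unlabeled_feats: list of feature vectors (tuples/lists of floats)
        extracted by a DNN from unlabeled video frames.
    labeled_feats:   feature vectors of already-annotated frames.
    budget:          number of new frames to pick for annotation.
    Returns indices into unlabeled_feats, most informative first.
    """
    labeled = list(labeled_feats)
    remaining = list(range(len(unlabeled_feats)))
    picked = []
    for _ in range(min(budget, len(remaining))):
        # Score each candidate by its distance to the closest labeled feature;
        # a large distance means the frame is unlike anything annotated so far.
        scores = [
            min(math.dist(unlabeled_feats[i], f) for f in labeled)
            for i in remaining
        ]
        best = remaining[scores.index(max(scores))]
        picked.append(best)
        # Treat the chosen frame as labeled so later picks stay diverse.
        labeled.append(unlabeled_feats[best])
        remaining.remove(best)
    return picked

# Toy usage: with one labeled frame at the origin, the most distant
# unlabeled feature is picked first, then the next most novel one.
chosen = select_informative_frames(
    unlabeled_feats=[(0.1, 0.0), (5.0, 0.0), (2.0, 0.0)],
    labeled_feats=[(0.0, 0.0)],
    budget=2,
)
print(chosen)  # → [1, 2]
```

In the full loop, the DNN would be retrained after each annotation round, and the feature extractor (hence the distances) would be refreshed before the next selection.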