Adaptive transfer learning for surgical tool presence detection in laparoscopic videos through gradual freezing fine-tuning

Date: 2025-10-17
Citations: 0
Influential: 0
AI Summary
To address the weak generalization of surgical tool presence detection models caused by scarce annotated data, this paper proposes a two-stage adaptive fine-tuning method. First, a linear probing stage conditions the newly added classification layers on top of a pre-trained CNN; second, a gradual freezing stage dynamically reduces the set of fine-tunable layers, which regulates adaptation to the surgical domain within a single training loop. According to the authors, this reduces network complexity and improves efficiency because no repeated training iterations are needed. Evaluated on the Cholec80 dataset with ImageNet-pretrained ResNet-50 and DenseNet-121, the method achieves 96.4% mAP, outperforming existing approaches and established fine-tuning techniques. Cross-domain generalization is further validated on the CATARACTS dataset of ophthalmic surgery, suggesting that the strategy applies beyond laparoscopic video.

๐Ÿ“ Abstract
Minimally invasive surgery can benefit significantly from automated surgical tool detection, enabling advanced analysis and assistance. However, the limited availability of annotated data in surgical settings poses a challenge for training robust deep learning models. This paper introduces a novel staged adaptive fine-tuning approach consisting of two steps: a linear probing stage to condition additional classification layers on a pre-trained CNN-based architecture and a gradual freezing stage to dynamically reduce the fine-tunable layers, aiming to regulate adaptation to the surgical domain. This strategy reduces network complexity and improves efficiency, requiring only a single training loop and eliminating the need for multiple iterations. We validated our method on the Cholec80 dataset, employing CNN architectures (ResNet-50 and DenseNet-121) pre-trained on ImageNet for detecting surgical tools in cholecystectomy endoscopic videos. Our results demonstrate that our method improves detection performance compared to existing approaches and established fine-tuning techniques, achieving a mean average precision (mAP) of 96.4%. To assess its broader applicability, the generalizability of the fine-tuning strategy was further confirmed on the CATARACTS dataset, a distinct domain of minimally invasive ophthalmic surgery. These findings suggest that gradual freezing fine-tuning is a promising technique for improving tool presence detection in diverse surgical procedures and may have broader applications in general image classification tasks.
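The abstract reports results as mean average precision (mAP) over tool classes. As a rough illustration of how such a number can be computed for multi-label presence detection, the sketch below uses the common non-interpolated form of average precision. The paper does not state its exact AP variant or tie handling, so the function names and the formula choice here are assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP for one tool class.

    scores: predicted presence confidence per frame.
    labels: ground-truth presence per frame (1 = tool visible, 0 = absent).
    Ties in scores keep their original frame order (sorted() is stable).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Rank frames from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    if hits == 0:
        raise ValueError("class has no positive frames; AP is undefined")
    return precision_sum / hits


def mean_average_precision(
    scores_per_class: Sequence[Sequence[float]],
    labels_per_class: Sequence[Sequence[int]],
) -> float:
    """Unweighted mean of the per-class AP values."""
    if not scores_per_class or len(scores_per_class) != len(labels_per_class):
        raise ValueError("need matching, non-empty per-class inputs")
    aps = [
        average_precision(s, l)
        for s, l in zip(scores_per_class, labels_per_class)
    ]
    return sum(aps) / len(aps)
```

Averaging per class rather than pooling all frames keeps rarely used tools from being drowned out by frequent ones, which matters when tool usage is imbalanced.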
Problem

Research questions and friction points this paper is trying to address.

Detecting surgical tools in laparoscopic videos automatically
Overcoming limited annotated data for surgical deep learning
Improving tool detection across diverse surgical procedure types
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradual freezing fine-tuning reduces network complexity dynamically
Linear probing conditions new layers on pre-trained CNN architecture
Single training loop improves efficiency without multiple iterations
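
The three points above can be sketched as a per-epoch schedule that decides which blocks of the network are trainable. This is a minimal, framework-free sketch under stated assumptions: the function name `trainable_blocks`, the parameters `probe_epochs` and `freeze_every`, and the rule of freezing one block at a time from the input side are all hypothetical. The source only says that linear probing comes first and that the number of fine-tunable layers is then reduced dynamically within a single training loop.

```python
from typing import List


def trainable_blocks(
    epoch: int, num_blocks: int, probe_epochs: int, freeze_every: int
) -> List[bool]:
    """Return one trainable flag per block for the given epoch.

    Index 0 is the block closest to the input; the last index is the
    newly added classification head, which always stays trainable.

    Assumed schedule (not the paper's exact rule):
      * epochs [0, probe_epochs): linear probing, only the head trains;
      * afterwards: all blocks start trainable, and one more block is
        frozen from the input side every `freeze_every` epochs.
    """
    if num_blocks < 1 or probe_epochs < 0 or freeze_every < 1 or epoch < 0:
        raise ValueError("invalid schedule parameters")
    head = num_blocks - 1
    if epoch < probe_epochs:
        return [i == head for i in range(num_blocks)]
    # Number of input-side blocks frozen so far, capped so the head stays free.
    frozen = min((epoch - probe_epochs) // freeze_every, head)
    return [i >= frozen for i in range(num_blocks)]


# Single training loop: the schedule is queried once per epoch, so both
# stages run back to back without restarting training.
schedule = [
    trainable_blocks(epoch, num_blocks=5, probe_epochs=2, freeze_every=3)
    for epoch in range(16)
]
```

In a real training script, each flag would be applied to the corresponding block's parameters before the epoch starts (for example by toggling gradient tracking), so the optimizer updates progressively fewer layers as training proceeds.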