SurgPLAN++: Universal Surgical Phase Localization Network for Online and Offline Inference

📅 2024-09-19
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing surgical phase recognition methods predominantly perform online frame-level classification without global temporal modeling, which yields temporally incoherent predictions and cannot support the full-video analysis required for offline clinical review. To address this, the authors propose SurgPLAN++, a universal Surgical Phase Localization Network that models surgical phases as continuous temporal segments rather than isolated frames, enabling unified online and offline operation. Methodologically: (1) surgical phase recognition is formulated as a temporal detection task over phase proposals; (2) a pseudo-complete-video augmentation strategy enables high-quality proposals during online streaming inference; and (3) a global prediction framework iteratively refines preceding predictions to enhance offline accuracy. Evaluated on multiple benchmarks, SurgPLAN++ consistently outperforms state-of-the-art methods in both online and offline settings, yielding more temporally coherent phase predictions and more precise phase boundary localization.

📝 Abstract
Surgical phase recognition is critical for assisting surgeons in understanding surgical videos. Existing studies have focused mainly on online surgical phase recognition, leveraging preceding frames to predict the current frame. Despite great progress, they formulate the task as a series of frame-wise classifications, which results in a lack of global context over the entire procedure and incoherent predictions. Moreover, beyond online analysis, accurate offline surgical phase recognition is also in significant clinical demand for retrospective analysis, and existing online algorithms do not fully analyze the entire video, thereby limiting their accuracy in offline analysis. To overcome these challenges and enhance both online and offline inference capabilities, we propose a universal Surgical Phase Localization Network, named SurgPLAN++, built on the principle of temporal detection. To ensure a global understanding of the surgical procedure, we devise a phase localization strategy for SurgPLAN++ that predicts phase segments across the entire video through phase proposals. For online analysis, to generate high-quality phase proposals, SurgPLAN++ incorporates a data augmentation strategy that extends the streaming video into a pseudo-complete video through mirroring, center-duplication, and down-sampling. For offline analysis, SurgPLAN++ capitalizes on its global phase prediction framework to continuously refine preceding predictions during each online inference step, thereby significantly improving the accuracy of phase recognition. We perform extensive experiments to validate the effectiveness, and our SurgPLAN++ achieves remarkable performance in both online and offline modes, outperforming state-of-the-art methods. The source code is available at https://github.com/franciszchen/SurgPLAN-Plus.
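The abstract names three operations for extending a streaming video into a pseudo-complete one (mirroring, center-duplication, and down-sampling) but not their exact order or parameters. The sketch below is one plausible composition, not the authors' implementation; the function name, operation order, and the quarter-length center window are illustrative assumptions.

```python
from typing import List

def pseudo_complete_video(frames: List[int], target_len: int) -> List[int]:
    """Extend a streaming prefix of frame indices into a pseudo-complete video.

    Illustrative sketch only: the paper specifies the three operations
    (mirroring, center-duplication, down-sampling) but not this exact
    composition or these window sizes.
    """
    # Mirroring: append a time-reversed copy so the clip plays forward then back.
    extended = frames + frames[::-1]
    # Center-duplication: repeat the middle portion to stretch the timeline.
    mid = len(extended) // 2
    quarter = max(1, len(extended) // 4)
    center = extended[mid - quarter : mid + quarter]
    extended = extended[:mid] + center + extended[mid:]
    # Down-sampling: uniformly subsample back to a fixed target length.
    if len(extended) > target_len:
        step = len(extended) / target_len
        extended = [extended[int(i * step)] for i in range(target_len)]
    return extended
```

In use, the streaming prefix observed so far would be padded this way before proposal generation, so the detector always sees a fixed-length, roughly "complete" timeline.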
Problem

Research questions and friction points this paper is trying to address.

Frame-wise classification lacks global context over the entire procedure, producing temporally incoherent predictions.
Existing online algorithms never analyze the full video, limiting accuracy in offline, retrospective review.
A single framework is needed that supports accurate inference in both online and offline settings.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal detection principle: phases are localized as segments via phase proposals rather than classified frame by frame.
Phase localization strategy that predicts phase segments across the entire video.
Data augmentation that extends the streaming video into a pseudo-complete video via mirroring, center-duplication, and down-sampling.
Zhen Chen
Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science and Innovation, Chinese Academy of Sciences
Xingjian Luo
Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science and Innovation, Chinese Academy of Sciences
Jinlin Wu
Institute of Automation, Chinese Academy of Sciences
Long Bai
Research Assistant, Institute of Computing Technology, Chinese Academy of Sciences
Event-Centric Analysis, Knowledge Graph, Natural Language Processing
Zhen Lei
Associate Professor, OSCO Research Chair in Off-site Construction
Offsite Construction, Construction Engineering and Management
Hongliang Ren
Chinese University of Hong Kong | National University of Singapore | JHU/Harvard(RF) | CUHK(PhD)
Biorobotics & intelligent systems, medical mechatronics, continuum/soft flexible robots & sensors, multisensory perception
Sébastien Ourselin
King’s College London
Hongbin Liu
Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science and Innovation, Chinese Academy of Sciences; Institute of Automation, Chinese Academy of Sciences